- 22 Jul, 2018 40 commits
-
-
Marc Zyngier authored
commit 5d81f7dc upstream. Now that all our infrastructure is in place, let's expose the availability of ARCH_WORKAROUND_2 to guests. We take this opportunity to tidy up a couple of SMCCC constants. Acked-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit b4f18c06 upstream. In order to forward the guest's ARCH_WORKAROUND_2 calls to EL3, add a small(-ish) sequence to handle it at EL2. Special care must be taken to track the state of the guest itself by updating the workaround flags. We also rely on patching to enable calls into the firmware. Note that since we need to execute branches, this always executes after the Spectre-v2 mitigation has been applied. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit 55e3748e upstream. In order to offer ARCH_WORKAROUND_2 support to guests, we need a bit of infrastructure. Let's add a flag indicating whether or not the guest uses SSBD mitigation. Depending on the state of this flag, allow KVM to disable ARCH_WORKAROUND_2 before entering the guest, and enable it when exiting it. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit 85478bab upstream. As we're going to require to access per-cpu variables at EL2, let's craft the minimum set of accessors required to implement reading a per-cpu variable, relying on tpidr_el2 to contain the per-cpu offset. Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit 9cdc0108 upstream. If running on a system that performs dynamic SSBD mitigation, allow userspace to request the mitigation for itself. This is implemented as a prctl call, allowing the mitigation to be enabled or disabled at will for this particular thread. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit 9dd9614f upstream. In order to allow userspace to be mitigated on demand, let's introduce a new thread flag that prevents the mitigation from being turned off when exiting to userspace, and doesn't turn it on on entry into the kernel (with the assumption that the mitigation is always enabled in the kernel itself). This will be used by a prctl interface introduced in a later patch. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit 647d0519 upstream. On a system where firmware can dynamically change the state of the mitigation, the CPU will always come up with the mitigation enabled, including when coming back from suspend. If the user has requested "no mitigation" via a command line option, let's enforce it by calling into the firmware again to disable it. Similarily, for a resume from hibernate, the mitigation could have been disabled by the boot kernel. Let's ensure that it is set back on in that case. Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit 986372c4 upstream. In order to avoid checking arm64_ssbd_callback_required on each kernel entry/exit even if no mitigation is required, let's add yet another alternative that by default jumps over the mitigation, and that gets nop'ed out if we're doing dynamic mitigation. Think of it as a poor man's static key... Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit c32e1736 upstream. We're about to need the mitigation state in various parts of the kernel in order to do the right thing for userspace and guests. Let's expose an accessor that will let other subsystems know about the state. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit a43ae4df upstream. On a system where the firmware implements ARCH_WORKAROUND_2, it may be useful to either permanently enable or disable the workaround for cases where the user decides that they'd rather not get a trap overhead, and keep the mitigation permanently on or off instead of switching it on exception entry/exit. In any case, default to the mitigation being enabled. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit a725e3dd upstream. As for Spectre variant-2, we rely on SMCCC 1.1 to provide the discovery mechanism for detecting the SSBD mitigation. A new capability is also allocated for that purpose, and a config option. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit 5cf9ce6e upstream. In a heterogeneous system, we can end up with both affected and unaffected CPUs. Let's check their status before calling into the firmware. Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit 8e290624 upstream. In order for the kernel to protect itself, let's call the SSBD mitigation implemented by the higher exception level (either hypervisor or firmware) on each transition between userspace and kernel. We must take the PSCI conduit into account in order to target the right exception level, hence the introduction of a runtime patching callback. Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
commit eff0e9e1 upstream. We've so far used the PSCI return codes for SMCCC because they were extremely similar. But with the new ARM DEN 0070A specification, "NOT_REQUIRED" (-2) is clashing with PSCI's "PSCI_RET_INVALID_PARAMS". Let's bite the bullet and add SMCCC specific return codes. Users can be repainted as and when required. Acked-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christoffer Dall authored
Commit 4464e210 upstream. We already have the percpu area for the host cpu state, which points to the VCPU, so there's no need to store the VCPU pointer on the stack on every context switch. We can be a little more clever and just use tpidr_el2 for the percpu offset and load the VCPU pointer from the host context. This has the benefit of being able to retrieve the host context even when our stack is corrupted, and it has a potential performance benefit because we trade a store plus a load for an mrs and a load on a round trip to the guest. This does require us to calculate the percpu offset without including the offset from the kernel mapping of the percpu array to the linear mapping of the array (which is what we store in tpidr_el1), because a PC-relative generated address in EL2 is already giving us the hyp alias of the linear mapping of a kernel address. We do this in __cpu_init_hyp_mode() by using kvm_ksym_ref(). The code that accesses ESR_EL2 was previously using an alternative to use the _EL1 accessor on VHE systems, but this was actually unnecessary as the _EL1 accessor aliases the ESR_EL2 register on VHE, and the _EL2 accessor does the same thing on both systems. Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
Commit 44a497ab upstream. kvm_vgic_global_state is part of the read-only section, and is usually accessed using a PC-relative address generation (adrp + add). It is thus useless to use kern_hyp_va() on it, and actively problematic if kern_hyp_va() becomes non-idempotent. On the other hand, there is no way that the compiler is going to guarantee that such access is always PC relative. So let's bite the bullet and provide our own accessor. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marc Zyngier authored
Commit dea5e2a4 upstream. We've so far relied on a patching infrastructure that only gave us a single alternative, without any way to provide a range of potential replacement instructions. For a single feature, this is an all or nothing thing. It would be interesting to have a more flexible grained way of patching the kernel though, where we could dynamically tune the code that gets injected. In order to achive this, let's introduce a new form of dynamic patching, assiciating a callback to a patching site. This callback gets source and target locations of the patching request, as well as the number of instructions to be patched. Dynamic patching is declared with the new ALTERNATIVE_CB and alternative_cb directives: asm volatile(ALTERNATIVE_CB("mov %0, #0\n", callback) : "r" (v)); or alternative_cb callback mov x0, #0 alternative_cb_end where callback is the C function computing the alternative. Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Morse authored
Commit 1f742679 upstream. Now that a VHE host uses tpidr_el2 for the cpu offset we no longer need KVM to save/restore tpidr_el1. Move this from the 'common' code into the non-vhe code. While we're at it, on VHE we don't need to save the ELR or SPSR as kernel_entry in entry.S will have pushed these onto the kernel stack, and will restore them from there. Move these to the non-vhe code as we need them to get back to the host. Finally remove the always-copy-tpidr we hid in the stage2 setup code, cpufeature's enable callback will do this for VHE, we only need KVM to do it for non-vhe. Add the copy into kvm-init instead. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Morse authored
Commit 6d99b689 upstream. Now that KVM uses tpidr_el2 in the same way as Linux's cpu_offset in tpidr_el1, merge the two. This saves KVM from save/restoring tpidr_el1 on VHE hosts, and allows future code to blindly access per-cpu variables without triggering world-switch. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Morse authored
Commit c97e166e upstream. Make tpidr_el2 a cpu-offset for per-cpu variables in the same way the host uses tpidr_el1. This lets tpidr_el{1,2} have the same value, and on VHE they can be the same register. KVM calls hyp_panic() when anything unexpected happens. This may occur while a guest owns the EL1 registers. KVM stashes the vcpu pointer in tpidr_el2, which it uses to find the host context in order to restore the host EL1 registers before parachuting into the host's panic(). The host context is a struct kvm_cpu_context allocated in the per-cpu area, and mapped to hyp. Given the per-cpu offset for this CPU, this is easy to find. Change hyp_panic() to take a pointer to the struct kvm_cpu_context. Wrap these calls with an asm function that retrieves the struct kvm_cpu_context from the host's per-cpu area. Copy the per-cpu offset from the hosts tpidr_el1 into tpidr_el2 during kvm init. (Later patches will make this unnecessary for VHE hosts) We print out the vcpu pointer as part of the panic message. Add a back reference to the 'running vcpu' in the host cpu context to preserve this. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Morse authored
Commit 36989e7f upstream. kvm_host_cpu_state is a per-cpu allocation made from kvm_arch_init() used to store the host EL1 registers when KVM switches to a guest. Make it easier for ASM to generate pointers into this per-cpu memory by making it a static allocation. Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
James Morse authored
Commit 32b03d10 upstream. KVM uses tpidr_el2 as its private vcpu register, which makes sense for non-vhe world switch as only KVM can access this register. This means vhe Linux has to use tpidr_el1, which KVM has to save/restore as part of the host context. If the SDEI handler code runs behind KVMs back, it mustn't access any per-cpu variables. To allow this on systems with vhe we need to make the host use tpidr_el2, saving KVM from save/restoring it. __guest_enter() stores the host_ctxt on the stack, do the same with the vcpu. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mark Rutland authored
Commit 1b7e2296 upstream. Shortly we will want to load a percpu variable in the return from userspace path. We can save an instruction by folding the addition of the percpu offset into the load instruction, and this patch adds a new helper to do so. At the same time, we clean up this_cpu_ptr for consistency. As with {adr,ldr,str}_l, we change the template to take the destination register first, and name this dst. Secondly, we rename the macro to adr_this_cpu, following the scheme of adr_l, and matching the newly added ldr_this_cpu. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Laura Abbott <labbott@redhat.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tetsuo Handa authored
commit 3bc53be9 upstream. syzbot is reporting stalls at nfc_llcp_send_ui_frame() [1]. This is because nfc_llcp_send_ui_frame() is retrying the loop without any delay when nonblocking nfc_alloc_send_skb() returned NULL. Since there is no need to use MSG_DONTWAIT if we retry until sock_alloc_send_pskb() succeeds, let's use blocking call. Also, in case an unexpected error occurred, let's break the loop if blocking nfc_alloc_send_skb() failed. [1] https://syzkaller.appspot.com/bug?id=4a131cc571c3733e0eff6bc673f4e36ae48f19c6Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot <syzbot+d29d18215e477cfbfbdd@syzkaller.appspotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Santosh Shilimkar authored
commit f1693c63 upstream. Loop transport which is self loopback, remote port congestion update isn't relevant. Infact the xmit path already ignores it. Receive path needs to do the same. Reported-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Westphal authored
commit 84379c9a upstream. Eric Dumazet reports: Here is a reproducer of an annoying bug detected by syzkaller on our production kernel [..] ./b78305423 enable_conntrack Then : sleep 60 dmesg | tail -10 [ 171.599093] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 181.631024] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 191.687076] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 201.703037] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 211.711072] unregister_netdevice: waiting for lo to become free. Usage count = 2 [ 221.959070] unregister_netdevice: waiting for lo to become free. Usage count = 2 Reproducer sends ipv6 fragment that hits nfct defrag via LOCAL_OUT hook. skb gets queued until frag timer expiry -- 1 minute. Normally nf_conntrack_reasm gets called during prerouting, so skb has no dst yet which might explain why this wasn't spotted earlier. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit c604cb76 upstream. My recent fix for dns_resolver_preparse() printing very long strings was incomplete, as shown by syzbot which still managed to hit the WARN_ONCE() in set_precision() by adding a crafted "dns_resolver" key: precision 50001 too large WARNING: CPU: 7 PID: 864 at lib/vsprintf.c:2164 vsnprintf+0x48a/0x5a0 The bug this time isn't just a printing bug, but also a logical error when multiple options ("#"-separated strings) are given in the key payload. Specifically, when separating an option string into name and value, if there is no value then the name is incorrectly considered to end at the end of the key payload, rather than the end of the current option. This bypasses validation of the option length, and also means that specifying multiple options is broken -- which presumably has gone unnoticed as there is currently only one valid option anyway. A similar problem also applied to option values, as the kstrtoul() when parsing the "dnserror" option will read past the end of the current option and into the next option. Fix these bugs by correctly computing the length of the option name and by copying the option value, null-terminated, into a temporary buffer. Reproducer for the WARN_ONCE() that syzbot hit: perl -e 'print "#A#", "\0" x 50000' | keyctl padd dns_resolver desc @s Reproducer for "dnserror" option being parsed incorrectly (expected behavior is to fail when seeing the unknown option "foo", actual behavior was to read the dnserror value as "1#foo" and fail there): perl -e 'print "#dnserror=1#foo\0"' | keyctl padd dns_resolver desc @s Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: 4a2d7892 ("DNS: If the DNS server returns an error, allow that to be cached [ver #2]") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Biggers authored
commit fe10e398 upstream. ReiserFS prepares log messages into a 1024-byte buffer with no bounds checks. Long messages, such as the "unknown mount option" warning when userspace passes a crafted mount options string, overflow this buffer. This causes KASAN to report a global-out-of-bounds write. Fix it by truncating messages to the buffer size. Link: http://lkml.kernel.org/r/20180707203621.30922-1-ebiggers3@gmail.com Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: syzbot+b890b3335a4d8c608963@syzkaller.appspotmail.com Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Florian Westphal authored
commit 11ff7288 upstream. the ebtables evaluation loop expects targets to return positive values (jumps), or negative values (absolute verdicts). This is completely different from what xtables does. In xtables, targets are expected to return the standard netfilter verdicts, i.e. NF_DROP, NF_ACCEPT, etc. ebtables will consider these as jumps. Therefore reject any target found due to unspec fallback. v2: also reject watchers. ebtables ignores their return value, so a target that assumes skb ownership (and returns NF_STOLEN) causes use-after-free. The only watchers in the 'ebtables' front-end are log and nflog; both have AF_BRIDGE specific wrappers on kernel side. Reported-by: syzbot+2b43f681169a2a0d306a@syzkaller.appspotmail.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stefan Wahren authored
commit dea39aca upstream. The skb size calculation in lan78xx_tx_bh is in race with the start_xmit, which could lead to rare kernel oopses. So protect the whole skb walk with a spin lock. As a benefit we can unlink the skb directly. This patch was tested on Raspberry Pi 3B+ Link: https://github.com/raspberrypi/linux/issues/2608 Fixes: 55d7de9d ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet") Cc: stable <stable@vger.kernel.org> Signed-off-by: Floris Bos <bos@je-eigen-domein.nl> Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ping-Ke Shih authored
commit 9a98302d upstream. Without this patch, firmware will not run properly on rtl8821ae, and it causes bad user experience. For example, bad connection performance with low rate, higher power consumption, and so on. rtl8821ae uses two kinds of firmwares for normal and WoWlan cases, and each firmware has firmware data buffer and size individually. Original code always overwrite size of normal firmware rtlpriv->rtlhal.fwsize, and this mismatch causes firmware checksum error, then firmware can't start. In this situation, driver gives message "Firmware is not ready to run!". Fixes: fe89707f ("rtlwifi: rtl8821ae: Simplify loading of WOWLAN firmware") Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Cc: Stable <stable@vger.kernel.org> # 4.0+ Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Gustavo A. R. Silva authored
commit 676bcfec upstream. t.qset_idx can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:2286 cxgb_extension_ioctl() warn: potential spectre issue 'adapter->msix_info' Fix this by sanitizing t.qset_idx before using it to index adapter->msix_info Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alex Vesker authored
[ Upstream commit d412c31d ] The command interface can work in two modes: Events and Polling. In the general case, each time we invoke a command, a work is queued to handle it. When working in events, the interrupt handler completes the command execution. On the other hand, when working in polling mode, the work itself completes it. Due to a bug in the work handler, a command could have been completed by the interrupt handler, while the work handler hasn't finished yet, causing the it to complete once again if the command interface mode was changed from Events to polling after the interrupt handler was called. mlx5_unload_one() mlx5_stop_eqs() // Destroy the EQ before cmd EQ ...cmd_work_handler() write_doorbell() --> EVENT_TYPE_CMD mlx5_cmd_comp_handler() // First free free_ent(cmd, ent->idx) complete(&ent->done) <-- mlx5_stop_eqs //cmd was complete // move to polling before destroying the last cmd EQ mlx5_cmd_use_polling() cmd->mode = POLL; --> cmd_work_handler (continues) if (cmd->mode == POLL) mlx5_cmd_comp_handler() // Double free The solution is to store the cmd->mode before writing the doorbell. Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit 945d015e ] We should put copy_skb in receive_queue only after a successful call to virtio_net_hdr_from_skb(). syzbot report : BUG: KASAN: use-after-free in __skb_unlink include/linux/skbuff.h:1843 [inline] BUG: KASAN: use-after-free in __skb_dequeue include/linux/skbuff.h:1863 [inline] BUG: KASAN: use-after-free in skb_dequeue+0x16a/0x180 net/core/skbuff.c:2815 Read of size 8 at addr ffff8801b044ecc0 by task syz-executor217/4553 CPU: 0 PID: 4553 Comm: syz-executor217 Not tainted 4.18.0-rc1+ #111 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __skb_unlink include/linux/skbuff.h:1843 [inline] __skb_dequeue include/linux/skbuff.h:1863 [inline] skb_dequeue+0x16a/0x180 net/core/skbuff.c:2815 skb_queue_purge+0x26/0x40 net/core/skbuff.c:2852 packet_set_ring+0x675/0x1da0 net/packet/af_packet.c:4331 packet_release+0x630/0xd90 net/packet/af_packet.c:2991 __sock_release+0xd7/0x260 net/socket.c:603 sock_close+0x19/0x20 net/socket.c:1186 __fput+0x35b/0x8b0 fs/file_table.c:209 ____fput+0x15/0x20 fs/file_table.c:243 task_work_run+0x1ec/0x2a0 kernel/task_work.c:113 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0x1b08/0x2750 kernel/exit.c:865 do_group_exit+0x177/0x440 kernel/exit.c:968 __do_sys_exit_group kernel/exit.c:979 [inline] __se_sys_exit_group kernel/exit.c:977 [inline] __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:977 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4448e9 Code: Bad RIP value. RSP: 002b:00007ffd5f777ca8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004448e9 RDX: 00000000004448e9 RSI: 000000000000fcfb RDI: 0000000000000001 RBP: 00000000006cf018 R08: 00007ffd0000a45b R09: 0000000000000000 R10: 00007ffd5f777e48 R11: 0000000000000202 R12: 00000000004021f0 R13: 0000000000402280 R14: 0000000000000000 R15: 0000000000000000 Allocated by task 4553: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554 skb_clone+0x1f5/0x500 net/core/skbuff.c:1282 tpacket_rcv+0x28f7/0x3200 net/packet/af_packet.c:2221 deliver_skb net/core/dev.c:1925 [inline] deliver_ptype_list_skb net/core/dev.c:1940 [inline] __netif_receive_skb_core+0x1bfb/0x3680 net/core/dev.c:4611 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4693 netif_receive_skb_internal+0x12e/0x7d0 net/core/dev.c:4767 netif_receive_skb+0xbf/0x420 net/core/dev.c:4791 tun_rx_batched.isra.55+0x4ba/0x8c0 drivers/net/tun.c:1571 tun_get_user+0x2af1/0x42f0 drivers/net/tun.c:1981 tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:2009 call_write_iter include/linux/fs.h:1795 [inline] new_sync_write fs/read_write.c:474 [inline] __vfs_write+0x6c6/0x9f0 fs/read_write.c:487 vfs_write+0x1f8/0x560 fs/read_write.c:549 ksys_write+0x101/0x260 fs/read_write.c:598 __do_sys_write fs/read_write.c:610 [inline] __se_sys_write fs/read_write.c:607 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:607 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 4553: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kmem_cache_free+0x86/0x2d0 mm/slab.c:3756 kfree_skbmem+0x154/0x230 net/core/skbuff.c:582 __kfree_skb net/core/skbuff.c:642 [inline] kfree_skb+0x1a5/0x580 net/core/skbuff.c:659 tpacket_rcv+0x189e/0x3200 net/packet/af_packet.c:2385 deliver_skb net/core/dev.c:1925 [inline] deliver_ptype_list_skb net/core/dev.c:1940 [inline] __netif_receive_skb_core+0x1bfb/0x3680 net/core/dev.c:4611 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4693 netif_receive_skb_internal+0x12e/0x7d0 net/core/dev.c:4767 netif_receive_skb+0xbf/0x420 net/core/dev.c:4791 tun_rx_batched.isra.55+0x4ba/0x8c0 drivers/net/tun.c:1571 tun_get_user+0x2af1/0x42f0 drivers/net/tun.c:1981 tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:2009 call_write_iter include/linux/fs.h:1795 [inline] new_sync_write fs/read_write.c:474 [inline] __vfs_write+0x6c6/0x9f0 fs/read_write.c:487 vfs_write+0x1f8/0x560 fs/read_write.c:549 ksys_write+0x101/0x260 fs/read_write.c:598 __do_sys_write fs/read_write.c:610 [inline] __se_sys_write fs/read_write.c:607 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:607 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8801b044ecc0 which belongs to the cache skbuff_head_cache of size 232 The buggy address is located 0 bytes inside of 232-byte region [ffff8801b044ecc0, ffff8801b044eda8) The buggy address belongs to the page: page:ffffea0006c11380 count:1 mapcount:0 mapping:ffff8801d9be96c0 index:0x0 flags: 0x2fffc0000000100(slab) raw: 02fffc0000000100 ffffea0006c17988 ffff8801d9bec248 ffff8801d9be96c0 raw: 0000000000000000 ffff8801b044e040 000000010000000c 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801b044eb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8801b044ec00: 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc >ffff8801b044ec80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ^ ffff8801b044ed00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8801b044ed80: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc Fixes: 58d19b19 ("packet: vnet_hdr support for tpacket_rcv") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jason Wang authored
[ Upstream commit b8f1f658 ] Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when we meet errors during ubuf allocation, the code does not check for NULL before calling sockfd_put(), this will lead NULL dereferencing. Fixing by checking sock pointer before. Fixes: bab632d6 ("vhost: vhost TX zero-copy support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ilpo Järvinen authored
[ Upstream commit 1236f22f ] If SACK is not enabled and the first cumulative ACK after the RTO retransmission covers more than the retransmitted skb, a spurious FRTO undo will trigger (assuming FRTO is enabled for that RTO). The reason is that any non-retransmitted segment acknowledged will set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is no indication that it would have been delivered for real (the scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK case so the check for that bit won't help like it does with SACK). Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo in tcp_process_loss. We need to use more strict condition for non-SACK case and check that none of the cumulatively ACKed segments were retransmitted to prove that progress is due to original transmissions. Only then keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in non-SACK case. (FLAG_ORIG_SACK_ACKED is planned to be renamed to FLAG_ORIG_PROGRESS to better indicate its purpose but to keep this change minimal, it will be done in another patch). Besides burstiness and congestion control violations, this problem can result in RTO loop: When the loss recovery is prematurely undoed, only new data will be transmitted (if available) and the next retransmission can occur only after a new RTO which in case of multiple losses (that are not for consecutive packets) requires one RTO per loss to recover. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Neal Cardwell <ncardwell@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuchung Cheng authored
[ Upstream commit c860e997 ] Fast Open key could be stored in different endian based on the CPU. Previously hosts in different endianness in a server farm using the same key config (sysctl value) would produce different cookies. This patch fixes it by always storing it as little endian to keep same API for LE hosts. Reported-by: Daniele Iamartino <danielei@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jiri Slaby authored
[ Upstream commit 0ee1f473 ] When unplugging an r8152 adapter while the interface is UP, the NIC becomes unusable. usb->disconnect (aka rtl8152_disconnect) deletes napi. Then, rtl8152_disconnect calls unregister_netdev and that invokes netdev->ndo_stop (aka rtl8152_close). rtl8152_close tries to napi_disable, but the napi is already deleted by disconnect above. So the first while loop in napi_disable never finishes. This results in complete deadlock of the network layer as there is rtnl_mutex held by unregister_netdev. So avoid the call to napi_disable in rtl8152_close when the device is already gone. The other calls to usb_kill_urb, cancel_delayed_work_sync, netif_stop_queue etc. seem to be fine. The urb and netdev is not destroyed yet. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: linux-usb@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Aleksander Morgado authored
[ Upstream commit e7e197ed ] This module exposes two USB configurations: a QMI+AT capable setup on USB config #1 and a MBIM capable setup on USB config #2. By default the kernel will choose the MBIM capable configuration as long as the cdc_mbim driver is available. This patch adds support for the QMI port in the secondary configuration. Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sudarsana Reddy Kalluru authored
[ Upstream commit bb7858ba ] Memory size is limited in the kdump kernel environment. Allocation of more msix-vectors (or queues) consumes few tens of MBs of memory, which might lead to the kdump kernel failure. This patch adds changes to limit the number of MSI-X vectors in kdump kernel to minimum required value (i.e., 2 per engine). Fixes: fe56b9e6 ("qed: Add module with basic common support") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-