- 07 Mar, 2015 40 commits
-
-
Nadav Amit authored
Guest which sets the PAT CR to invalid value should get a #GP. Currently, if vmx supports loading PAT CR during entry, then the value is not checked. This patch makes the required check in that case. Signed-off-by:
Nadav Amit <namit@cs.technion.ac.il> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 4566654b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Nadav Amit authored
In 64-bit mode a #GP should be delivered to the guest "if the code segment descriptor pointed to by the selector in the 64-bit gate doesn't have the L-bit set and the D-bit clear." - Intel SDM "Interrupt 13—General Protection Exception (#GP)". This patch fixes the behavior of CS loading emulation code. Although the comment says that segment loading is not supported in long mode, this function is executed in long mode, so the fix is necassary. Signed-off-by:
Nadav Amit <namit@cs.technion.ac.il> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 040c8dc8) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Nadav Amit authored
Far jmp/call/ret may fault while loading a new RIP. Currently KVM does not handle this case, and may result in failed vm-entry once the assignment is done. The tricky part of doing so is that loading the new CS affects the VMCS/VMCB state, so if we fail during loading the new RIP, we are left in unconsistent state. Therefore, this patch saves on 64-bit the old CS descriptor and restores it if loading RIP failed. This fixes CVE-2014-3647. Cc: stable@vger.kernel.org Signed-off-by:
Nadav Amit <namit@cs.technion.ac.il> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit d1442d85) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Nadav Amit authored
The KVM emulator code assumes that the guest virtual address space (in 64-bit) is 48-bits wide. Fail the KVM_SET_CPUID and KVM_SET_CPUID2 ioctl if userspace tries to create a guest that does not obey this restriction. Signed-off-by:
Nadav Amit <namit@cs.technion.ac.il> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit dd598091) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Eric Dumazet authored
softnet_data.input_pkt_queue is protected by a spinlock that we must hold when transferring packets from victim queue to an active one. This is because other cpus could still be trying to enqueue packets into victim queue. A second problem is that when we transfert the NAPI poll_list from victim to current cpu, we absolutely need to special case the percpu backlog, because we do not want to add complex locking to protect process_queue : Only owner cpu is allowed to manipulate it, unless cpu is offline. Based on initial patch from Prasad Sodagudi & Subash Abhinov Kasiviswanathan. This version is better because we do not slow down packet processing, only make migration safer. Reported-by:
Prasad Sodagudi <psodagud@codeaurora.org> Reported-by:
Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit ac64da0b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
karl beldan authored
Fixed commit added from64to32 under _#ifndef do_csum_ but used it under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's robot reported TILEGX's). Move from64to32 under the latter. Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold") Reported-by:
kbuild test robot <fengguang.wu@intel.com> Signed-off-by:
Karl Beldan <karl.beldan@rivierawaves.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 9ce35779) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Peter Kümmel authored
Warning: In file included from scripts/kconfig/zconf.tab.c:2537:0: scripts/kconfig/menu.c: In function ‘get_symbol_str’: scripts/kconfig/menu.c:590:18: warning: ‘jump’ may be used uninitialized in this function [-Wmaybe-uninitialized] jump->offset = strlen(r->s); Simplifies the test logic because (head && local) means (jump != 0) and makes GCC happy when checking if the jump pointer was initialized. Signed-off-by:
Peter Kümmel <syntheticpp@gmx.net> Signed-off-by:
Michal Marek <mmarek@suse.cz> (cherry picked from commit 2d560306) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
karl beldan authored
Fixed commit added from64to32 under _#ifndef do_csum_ but used it under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's robot reported TILEGX's). Move from64to32 under the latter. Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold") Reported-by:
kbuild test robot <fengguang.wu@intel.com> Signed-off-by:
Karl Beldan <karl.beldan@rivierawaves.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
David S. Miller <davem@davemloft.net> tcp: ipv4: initialize unicast_sock sk_pacing_rate When I added sk_pacing_rate field, I forgot to initialize its value in the per cpu unicast_sock used in ip_send_unicast_reply() This means that for sch_fq users, RST packets, or ACK packets sent on behalf of TIME_WAIT sockets might be sent to slowly or even dropped once we reach the per flow limit. Signed-off-by:
Eric Dumazet <edumazet@google.com> Fixes: 95bd09eb ("tcp: TSO packets automatic sizing") Signed-off-by:
David S. Miller <davem@davemloft.net> lib/checksum.c: fix carry in csum_tcpudp_nofold The carry from the 64->32bits folding was dropped, e.g with: saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1, csum_tcpudp_nofold returned 0 instead of 1. Signed-off-by:
Karl Beldan <karl.beldan@rivierawaves.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Mike Frysinger <vapier@gentoo.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 9ce35779 150ae0e9) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Martin Walch authored
The struct gstr has a capacity that may differ from the actual string length. However, a string manipulation in the function search_conf made the assumption that it is the same, which led to messing up some search results, especially when the content of the gstr in use had not yet reached at least 63 chars. Signed-off-by:
Martin Walch <walch.martin@web.de> Acked-by:
Wang YanQing <udknight@gmail.com> Acked-by:
Benjamin Poirier <bpoirier@suse.de> Reviewed-by:
"Yann E. MORIN" <yann.morin.1998@free.fr> Signed-off-by:
"Yann E. MORIN" <yann.morin.1998@free.fr> (cherry picked from commit 503c8230) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Adrian Hunter authored
Intel SDIO has broken card detect so add a quirk to reflect that. Signed-off-by:
Adrian Hunter <adrian.hunter@intel.com> Acked-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Chris Ball <chris@printf.net> (cherry picked from commit c6748017) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Nicholas Bellinger authored
This patch drops the arbitrary maximum I/O size limit in sbc_parse_cdb(), which currently for fabric_max_sectors is hardcoded to 8192 (4 MB for 512 byte sector devices), and for hw_max_sectors is a backend driver dependent value. This limit is problematic because Linux initiators have only recently started to honor block limits MAXIMUM TRANSFER LENGTH, and other non-Linux based initiators (eg: MSFT Fibre Channel) can also generate I/Os larger than 4 MB in size. Currently when this happens, the following message will appear on the target resulting in I/Os being returned with non recoverable status: SCSI OP 28h with too big sectors 16384 exceeds fabric_max_sectors: 8192 Instead, drop both [fabric,hw]_max_sector checks in sbc_parse_cdb(), and convert the existing hw_max_sectors into a purely informational attribute used to represent the granuality that backend driver and/or subsystem code is splitting I/Os upon. Also, update FILEIO with an explicit FD_MAX_BYTES check in fd_execute_rw() to deal with the one special iovec limitiation case. v2 changes: - Drop hw_max_sectors check in sbc_parse_cdb() Reported-by:
Lance Gropper <lance.gropper@qosserver.com> Reported-by:
Stefan Priebe <s.priebe@profihost.ag> Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Roland Dreier <roland@purestorage.com> Cc: stable@vger.kernel.org # 3.4 Signed-off-by:
Nicholas Bellinger <nab@linux-iscsi.org> (cherry picked from commit 046ba642) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Hans de Goede authored
Wifi on this laptop does not work unless asus-nb-wmi.wapf=4 is specified on the kerne commandline, add a quirk for this. Cc: stable@vger.kernel.org BugLink: https://bugs.launchpad.net/bugs/1173681Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Darren Hart <dvhart@linux.intel.com> (cherry picked from commit 841e11cc) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Stanislaw Gruszka authored
X550VB as many others Asus laptops need wapf4 quirk to make RFKILL switch be functional. Otherwise system boots with wireless card disabled and is only possible to enable it by suspend/resume. Bug report: http://bugzilla.redhat.com/show_bug.cgi?id=1089731#c23Reported-and-tested-by:
Vratislav Podzimek <vpodzime@redhat.com> Signed-off-by:
Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by:
Darren Hart <dvhart@linux.intel.com> (cherry picked from commit 4ec7a45b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Hans de Goede authored
As reported here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1173681 the U32U needs wapf=4 too. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Matthew Garrett <matthew.garrett@nebula.com> (cherry picked from commit 831a444e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Hans de Goede authored
As reported here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1173681 the X550CC needs wapf=4 too. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Matthew Garrett <matthew.garrett@nebula.com> (cherry picked from commit 6d6ded3b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Hans de Goede authored
As reported here: https://bugs.launchpad.net/bugs/1277959 the X550CL needs wapf=4 too. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Matthew Garrett <matthew.garrett@nebula.com> (cherry picked from commit 22ba58c8) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Chris Wilson authored
In order to act as a full command barrier by itself, we need to tell the pipecontrol to actually stall the command streamer while the flush runs. We require the full command barrier before operations like MI_SET_CONTEXT, which currently rely on a prior invalidate flush. References: https://bugs.freedesktop.org/show_bug.cgi?id=83677 Cc: Simon Farnsworth <simon@farnz.org.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by:
Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by:
Jani Nikula <jani.nikula@intel.com> (cherry picked from commit add284a3) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Gwendal Grignou authored
commit d1c7e29e upstream. Before ->start() is called, bufsize size is set to HID_MIN_BUFFER_SIZE, 64 bytes. While processing the IRQ, we were asking to receive up to wMaxInputLength bytes, which can be bigger than 64 bytes. Later, when ->start is run, a proper bufsize will be calculated. Given wMaxInputLength is said to be unreliable in other part of the code, set to receive only what we can even if it results in truncated reports. Signed-off-by:
Gwendal Grignou <gwendal@chromium.org> Reviewed-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> Signed-off-by:
Jiri Slaby <jslaby@suse.cz> (cherry picked from commit f0a431e7)
-
Eric W. Biederman authored
Forced unmount affects not just the mount namespace but the underlying superblock as well. Restrict forced unmount to the global root user for now. Otherwise it becomes possible a user in a less privileged mount namespace to force the shutdown of a superblock of a filesystem in a more privileged mount namespace, allowing a DOS attack on root. Cc: stable@vger.kernel.org Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com> (cherry picked from commit b2f5d4dc) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Ilya Dryomov authored
commit cc9f1f51 upstream. No reason to use BUG_ON for osd request list assertions. Signed-off-by:
Ilya Dryomov <idryomov@redhat.com> Reviewed-by:
Alex Elder <elder@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 26e9bfd9)
-
Ilya Dryomov authored
commit 7c6e6fc5 upstream. It is important that both regular and lingering requests lists are empty when the OSD is removed. Signed-off-by:
Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by:
Alex Elder <elder@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 81113cff)
-
Seth Forshee authored
d1c7e29e (HID: i2c-hid: prevent buffer overflow in early IRQ) changed hid_get_input() to read ihid->bufsize bytes, which can be more than wMaxInputLength. This is the case with the Dell XPS 13 9343, and it is causing events to be missed. In some cases the missed events are releases, which can cause the cursor to jump or freeze, among other problems. Limit the number of bytes read to min(wMaxInputLength, ihid->bufsize) to prevent such problems. Fixes: d1c7e29e "HID: i2c-hid: prevent buffer overflow in early IRQ" Signed-off-by:
Seth Forshee <seth.forshee@canonical.com> Reviewed-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit 6d00f37e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Hannes Frederic Sowa authored
Lubomir Rintel reported that during replacing a route the interface reference counter isn't correctly decremented. To quote bug <https://bugzilla.kernel.org/show_bug.cgi?id=91941>: | [root@rhel7-5 lkundrak]# sh -x lal | + ip link add dev0 type dummy | + ip link set dev0 up | + ip link add dev1 type dummy | + ip link set dev1 up | + ip addr add 2001:db8:8086::2/64 dev dev0 | + ip route add 2001:db8:8086::/48 dev dev0 proto static metric 20 | + ip route add 2001:db8:8088::/48 dev dev1 proto static metric 10 | + ip route replace 2001:db8:8086::/48 dev dev1 proto static metric 20 | + ip link del dev0 type dummy | Message from syslogd@rhel7-5 at Jan 23 10:54:41 ... | kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2 | | Message from syslogd@rhel7-5 at Jan 23 10:54:51 ... | kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2 During replacement of a rt6_info we must walk all parent nodes and check if the to be replaced rt6_info got propagated. If so, replace it with an alive one. Fixes: 4a287eba ("IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag") Reported-by:
Lubomir Rintel <lkundrak@v3.sk> Signed-off-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Tested-by:
Lubomir Rintel <lkundrak@v3.sk> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 6e9e16e6) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
karl beldan authored
Fixed commit added from64to32 under _#ifndef do_csum_ but used it under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's robot reported TILEGX's). Move from64to32 under the latter. Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold") Reported-by:
kbuild test robot <fengguang.wu@intel.com> Signed-off-by:
Karl Beldan <karl.beldan@rivierawaves.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 9ce35779) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
karl beldan authored
Fixed commit added from64to32 under _#ifndef do_csum_ but used it under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's robot reported TILEGX's). Move from64to32 under the latter. Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold") Reported-by:
kbuild test robot <fengguang.wu@intel.com> Signed-off-by:
Karl Beldan <karl.beldan@rivierawaves.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
David S. Miller <davem@davemloft.net> tcp: ipv4: initialize unicast_sock sk_pacing_rate When I added sk_pacing_rate field, I forgot to initialize its value in the per cpu unicast_sock used in ip_send_unicast_reply() This means that for sch_fq users, RST packets, or ACK packets sent on behalf of TIME_WAIT sockets might be sent to slowly or even dropped once we reach the per flow limit. Signed-off-by:
Eric Dumazet <edumazet@google.com> Fixes: 95bd09eb ("tcp: TSO packets automatic sizing") Signed-off-by:
David S. Miller <davem@davemloft.net> lib/checksum.c: fix carry in csum_tcpudp_nofold The carry from the 64->32bits folding was dropped, e.g with: saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1, csum_tcpudp_nofold returned 0 instead of 1. Signed-off-by:
Karl Beldan <karl.beldan@rivierawaves.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Mike Frysinger <vapier@gentoo.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 9ce35779 150ae0e9) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David Vrabel authored
This reverts commit 2c3fc8d2. This commit broke on x86 PV because entries in the generic SWIOTLB are indexed using (pseudo-)physical address not DMA address and these are not the same in a x86 PV guest. Signed-off-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com> Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single" This reverts commit 2c3fc8d2. This commit broke on x86 PV because entries in the generic SWIOTLB are indexed using (pseudo-)physical address not DMA address and these are not the same in a x86 PV guest. Signed-off-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com> xen: annotate xen_set_identity_and_remap_chunk() with __init Commit 5b8e7d80 removed the __init annotation from xen_set_identity_and_remap_chunk(). Add it again. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: introduce helper functions to do safe read and write accesses Introduce two helper functions to safely read and write unsigned long values from or to memory when the access may fault because the mapping is non-present or read-only. These helpers can be used instead of open coded uses of __get_user() and __put_user() avoiding the need to do casts to fix sparse warnings. Use the helpers in page.h and p2m.c. This will fix the sparse warnings when doing "make C=1". Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Speed up set_phys_to_machine() by using read-only mappings Instead of checking at each call of set_phys_to_machine() whether a new p2m page has to be allocated due to writing an entry in a large invalid or identity area, just map those areas read only and react to a page fault on write by allocating the new page. This change will make the common path with no allocation much faster as it only requires a single write of the new mfn instead of walking the address translation tables and checking for the special cases. Suggested-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: switch to linear virtual mapped sparse p2m list At start of the day the Xen hypervisor presents a contiguous mfn list to a pv-domain. In order to support sparse memory this mfn list is accessed via a three level p2m tree built early in the boot process. Whenever the system needs the mfn associated with a pfn this tree is used to find the mfn. Instead of using a software walked tree for accessing a specific mfn list entry this patch is creating a virtual address area for the entire possible mfn list including memory holes. The holes are covered by mapping a pre-defined page consisting only of "invalid mfn" entries. Access to a mfn entry is possible by just using the virtual base address of the mfn list and the pfn as index into that list. This speeds up the (hot) path of determining the mfn of a pfn. Kernel build on a Dell Latitude E6440 (2 cores, HT) in 64 bit Dom0 showed following improvements: Elapsed time: 32:50 -> 32:35 System: 18:07 -> 17:47 User: 104:00 -> 103:30 Tested with following configurations: - 64 bit dom0, 8GB RAM - 64 bit dom0, 128 GB RAM, PCI-area above 4 GB - 32 bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without the patch) - 32 bit domU, ballooning up and down - 32 bit domU, save and restore - 32 bit domU with PCI passthrough - 64 bit domU, 8 GB, 2049 MB, 5000 MB - 64 bit domU, ballooning up and down - 64 bit domU, save and restore - 64 bit domU with PCI passthrough Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Hide get_phys_to_machine() to be able to tune common path Today get_phys_to_machine() is always called when the mfn for a pfn is to be obtained. Add a wrapper __pfn_to_mfn() as inline function to be able to avoid calling get_phys_to_machine() when possible as soon as the switch to a linear mapped p2m list has been done. Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> x86: Introduce function to get pmd entry pointer Introduces lookup_pmd_address() to get the address of the pmd entry related to a virtual address in the current address space. This function is needed for support of a virtual mapped sparse p2m list in xen pv domains, as we need the address of the pmd entry, not the one of the pte in that case. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Delay invalidating extra memory When the physical memory configuration is initialized the p2m entries for not pouplated memory pages are set to "invalid". As those pages are beyond the hypervisor built p2m list the p2m tree has to be extended. This patch delays processing the extra memory related p2m entries during the boot process until some more basic memory management functions are callable. This removes the need to create new p2m entries until virtual memory management is available. Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Delay m2p_override initialization The m2p overrides are used to be able to find the local pfn for a foreign mfn mapped into the domain. They are used by driver backends having to access frontend data. As this functionality isn't used in early boot it makes no sense to initialize the m2p override functions very early. It can be done later without doing any harm, removing the need for allocating memory via extend_brk(). While at it make some m2p override functions static as they are only used internally. Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Delay remapping memory of pv-domain Early in the boot process the memory layout of a pv-domain is changed to match the E820 map (either the host one for Dom0 or the Xen one) regarding placement of RAM and PCI holes. This requires removing memory pages initially located at positions not suitable for RAM and adding them later at higher addresses where no restrictions apply. To be able to operate on the hypervisor supported p2m list until a virtual mapped linear p2m list can be constructed, remapping must be delayed until virtual memory management is initialized, as the initial p2m list can't be extended unlimited at physical memory initialization time due to it's fixed structure. A further advantage is the reduction in complexity and code volume as we don't have to be careful regarding memory restrictions during p2m updates. Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: use common page allocation function in p2m.c In arch/x86/xen/p2m.c three different allocation functions for obtaining a memory page are used: extend_brk(), alloc_bootmem_align() or __get_free_page(). Which of those functions is used depends on the progress of the boot process of the system. Introduce a common allocation routine selecting the to be called allocation routine dynamically based on the boot progress. This allows moving initialization steps without having to care about changing allocation calls. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Make functions static Some functions in arch/x86/xen/p2m.c are used locally only. Make them static. Rearrange the functions in p2m.c to avoid forward declarations. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: fix some style issues in p2m.c The source arch/x86/xen/p2m.c has some coding style issues. Fix them. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> (cherry picked from commit dbdd7476 4ef8e3f3) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Chris Wilson authored
In order to act as a full command barrier by itself, we need to tell the pipecontrol to actually stall the command streamer while the flush runs. We require the full command barrier before operations like MI_SET_CONTEXT, which currently rely on a prior invalidate flush. References: https://bugs.freedesktop.org/show_bug.cgi?id=83677 Cc: Simon Farnsworth <simon@farnz.org.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by:
Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by:
Jani Nikula <jani.nikula@intel.com> (cherry picked from commit add284a3) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Gwendal Grignou authored
commit d1c7e29e upstream. Before ->start() is called, bufsize size is set to HID_MIN_BUFFER_SIZE, 64 bytes. While processing the IRQ, we were asking to receive up to wMaxInputLength bytes, which can be bigger than 64 bytes. Later, when ->start is run, a proper bufsize will be calculated. Given wMaxInputLength is said to be unreliable in other part of the code, set to receive only what we can even if it results in truncated reports. Signed-off-by:
Gwendal Grignou <gwendal@chromium.org> Reviewed-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 6d7f4a3b)
-
Stefano Stabellini authored
This reverts commit 2c3fc8d2. This commit broke on x86 PV because entries in the generic SWIOTLB are indexed using (pseudo-)physical address not DMA address and these are not the same in a x86 PV guest. Signed-off-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com> Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single" This reverts commit 2c3fc8d2. This commit broke on x86 PV because entries in the generic SWIOTLB are indexed using (pseudo-)physical address not DMA address and these are not the same in a x86 PV guest. Signed-off-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com> xen: annotate xen_set_identity_and_remap_chunk() with __init Commit 5b8e7d80 removed the __init annotation from xen_set_identity_and_remap_chunk(). Add it again. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: introduce helper functions to do safe read and write accesses Introduce two helper functions to safely read and write unsigned long values from or to memory when the access may fault because the mapping is non-present or read-only. These helpers can be used instead of open coded uses of __get_user() and __put_user() avoiding the need to do casts to fix sparse warnings. Use the helpers in page.h and p2m.c. This will fix the sparse warnings when doing "make C=1". Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Speed up set_phys_to_machine() by using read-only mappings Instead of checking at each call of set_phys_to_machine() whether a new p2m page has to be allocated due to writing an entry in a large invalid or identity area, just map those areas read only and react to a page fault on write by allocating the new page. This change will make the common path with no allocation much faster as it only requires a single write of the new mfn instead of walking the address translation tables and checking for the special cases. Suggested-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: switch to linear virtual mapped sparse p2m list At start of the day the Xen hypervisor presents a contiguous mfn list to a pv-domain. In order to support sparse memory this mfn list is accessed via a three level p2m tree built early in the boot process. Whenever the system needs the mfn associated with a pfn this tree is used to find the mfn. Instead of using a software walked tree for accessing a specific mfn list entry this patch is creating a virtual address area for the entire possible mfn list including memory holes. The holes are covered by mapping a pre-defined page consisting only of "invalid mfn" entries. Access to a mfn entry is possible by just using the virtual base address of the mfn list and the pfn as index into that list. This speeds up the (hot) path of determining the mfn of a pfn. Kernel build on a Dell Latitude E6440 (2 cores, HT) in 64 bit Dom0 showed following improvements: Elapsed time: 32:50 -> 32:35 System: 18:07 -> 17:47 User: 104:00 -> 103:30 Tested with following configurations: - 64 bit dom0, 8GB RAM - 64 bit dom0, 128 GB RAM, PCI-area above 4 GB - 32 bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without the patch) - 32 bit domU, ballooning up and down - 32 bit domU, save and restore - 32 bit domU with PCI passthrough - 64 bit domU, 8 GB, 2049 MB, 5000 MB - 64 bit domU, ballooning up and down - 64 bit domU, save and restore - 64 bit domU with PCI passthrough Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Hide get_phys_to_machine() to be able to tune common path Today get_phys_to_machine() is always called when the mfn for a pfn is to be obtained. Add a wrapper __pfn_to_mfn() as inline function to be able to avoid calling get_phys_to_machine() when possible as soon as the switch to a linear mapped p2m list has been done. Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> x86: Introduce function to get pmd entry pointer Introduces lookup_pmd_address() to get the address of the pmd entry related to a virtual address in the current address space. This function is needed for support of a virtual mapped sparse p2m list in xen pv domains, as we need the address of the pmd entry, not the one of the pte in that case. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Delay invalidating extra memory When the physical memory configuration is initialized the p2m entries for not pouplated memory pages are set to "invalid". As those pages are beyond the hypervisor built p2m list the p2m tree has to be extended. This patch delays processing the extra memory related p2m entries during the boot process until some more basic memory management functions are callable. This removes the need to create new p2m entries until virtual memory management is available. Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Delay m2p_override initialization The m2p overrides are used to be able to find the local pfn for a foreign mfn mapped into the domain. They are used by driver backends having to access frontend data. As this functionality isn't used in early boot it makes no sense to initialize the m2p override functions very early. It can be done later without doing any harm, removing the need for allocating memory via extend_brk(). While at it make some m2p override functions static as they are only used internally. Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Reviewed-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Delay remapping memory of pv-domain Early in the boot process the memory layout of a pv-domain is changed to match the E820 map (either the host one for Dom0 or the Xen one) regarding placement of RAM and PCI holes. This requires removing memory pages initially located at positions not suitable for RAM and adding them later at higher addresses where no restrictions apply. To be able to operate on the hypervisor supported p2m list until a virtual mapped linear p2m list can be constructed, remapping must be delayed until virtual memory management is initialized, as the initial p2m list can't be extended unlimited at physical memory initialization time due to it's fixed structure. A further advantage is the reduction in complexity and code volume as we don't have to be careful regarding memory restrictions during p2m updates. Signed-off-by:
Juergen Gross <jgross@suse.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: use common page allocation function in p2m.c In arch/x86/xen/p2m.c three different allocation functions for obtaining a memory page are used: extend_brk(), alloc_bootmem_align() or __get_free_page(). Which of those functions is used depends on the progress of the boot process of the system. Introduce a common allocation routine selecting the to be called allocation routine dynamically based on the boot progress. This allows moving initialization steps without having to care about changing allocation calls. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: Make functions static Some functions in arch/x86/xen/p2m.c are used locally only. Make them static. Rearrange the functions in p2m.c to avoid forward declarations. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen: fix some style issues in p2m.c The source arch/x86/xen/p2m.c has some coding style issues. Fix them. Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen/pci: Use APIC directly when APIC virtualization hardware is available When hardware supports APIC/x2APIC virtualization we don't need to use pirqs for MSI handling and instead use APIC since most APIC accesses (MMIO or MSR) will now be processed without VMEXITs. As an example, netperf on the original code produces this profile (collected wih 'xentrace -e 0x0008ffff -T 5'): 342 cpu_change 260 CPUID 34638 HLT 64067 INJ_VIRQ 28374 INTR 82733 INTR_WINDOW 10 NPF 24337 TRAP 370610 vlapic_accept_pic_intr 307528 VMENTRY 307527 VMEXIT 140998 VMMCALL 127 wrap_buffer After applying this patch the same test shows 230 cpu_change 260 CPUID 36542 HLT 174 INJ_VIRQ 27250 INTR 222 INTR_WINDOW 20 NPF 24999 TRAP 381812 vlapic_accept_pic_intr 166480 VMENTRY 166479 VMEXIT 77208 VMMCALL 81 wrap_buffer ApacheBench results (ab -n 10000 -c 200) improve by about 10% Signed-off-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by:
Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen/pci: Defer initialization of MSI ops on HVM guests If the hardware supports APIC virtualization we may decide not to use pirqs and instead use APIC/x2APIC directly, meaning that we don't want to set x86_msi.setup_msi_irqs and x86_msi.teardown_msi_irq to Xen-specific routines. However, x2APIC is not set up by the time pci_xen_hvm_init() is called so we need to postpone setting these ops until later, when we know which APIC mode is used. (Note that currently x2APIC is never initialized on HVM guests. This may change in the future) Signed-off-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen-pciback: drop SR-IOV VFs when PF driver unloads When a PF driver unloads, it may find it necessary to leave the VFs around simply because of pciback having marked them as assigned to a guest. Utilize a suitable notification to let go of the VFs, thus allowing the PF to go back into the state it was before its driver loaded (which in particular allows the driver to be loaded again with it being able to create the VFs anew, but which also allows to then pass through the PF instead of the VFs). Don't do this however for any VFs currently in active use by a guest. Signed-off-by:
Jan Beulich <jbeulich@suse.com> [v2: Removed the switch statement, moved it about] [v3: Redid it a bit differently] Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen/pciback: Restore configuration space when detaching from a guest. The commit "xen/pciback: Don't deadlock when unbinding." was using the version of pci_reset_function which would lock the device lock. That is no good as we can dead-lock. As such we swapped to using the lock-less version and requiring that the callers of 'pcistub_put_pci_dev' take the device lock. And as such this bug got exposed. Using the lock-less version is OK, except that we tried to use 'pci_restore_state' after the lock-less version of __pci_reset_function_locked - which won't work as 'state_saved' is set to false. Said 'state_saved' is a toggle boolean that is to be used by the sequence of a) pci_save_state/pci_restore_state or b) pci_load_and_free_saved_state/pci_restore_state. We don't want to use a) as the guest might have messed up the PCI configuration space and we want it to revert to the state when the PCI device was binded to us. Therefore we pick b) to restore the configuration space. We restore from our 'golden' version of PCI configuration space, when an: - Device is unbinded from pciback - Device is detached from a guest. Reported-by:
Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> PCI: Expose pci_load_saved_state for public consumption. We have the pci_load_and_free_saved_state, and pci_store_saved_state but are missing the functionality to just load the state multiple times in the PCI device without having to free/save the state. This patch makes it possible to use this function. Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen/pciback: Remove tons of dereferences A little cleanup. No functional difference. Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen/pciback: Print out the domain owning the device. We had been printing it only if the device was built with debug enabled. But this information is useful in the field to troubleshoot. Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen/pciback: Include the domain id if removing the device whilst still in use Cleanup the function a bit - also include the id of the domain that is using the device. Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by:
David Vrabel <david.vrabel@citrix.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> driver core: Provide an wrapper around the mutex to do lockdep warnings Instead of open-coding it in drivers that want to double check that their functions are indeed holding the device lock. Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Suggested-by:
David Vrabel <david.vrabel@citrix.com> Acked-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> xen/pciback: Don't deadlock when unbinding. As commit 0a9fd015 'xen/pciback: Document the entry points for 'pcistub_put_pci_dev'' explained there are four entry points in this function. Two of them are when the user fiddles in the SysFS to unbind a device which might be in use by a guest or not. Both 'unbind' states will cause a deadlock as the the PCI lock has already been taken, which then pci_device_reset tries to take. We can simplify this by requiring that all callers of pcistub_put_pci_dev MUST hold the device lock. And then we can just call the lockless version of pci_device_reset. To make it even simpler we will modify xen_pcibk_release_pci_dev to quality whether it should take a lock or not - as it ends up calling xen_pcibk_release_pci_dev and needs to hold the lock. Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com> swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single Need to pass the pointer within the swiotlb internal buffer to the swiotlb library, that in the case of xen_unmap_single is dev_addr, not paddr. Signed-off-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: stable@vger.kernel.org (cherry picked from commit dbdd7476 4ef8e3f3 2c3fc8d2) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Marcelo Tosatti authored
When the guest writes to the TSC, the masterclock TSC copy must be updated as well along with the TSC_OFFSET update, otherwise a negative tsc_timestamp is calculated at kvm_guest_time_update. Once "if (!vcpus_matched && ka->use_master_clock)" is simplified to "if (ka->use_master_clock)", the corresponding "if (!ka->use_master_clock)" becomes redundant, so remove the do_request boolean and collapse everything into a single condition. Signed-off-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 7f187922) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jan Kara authored
Check length of extended attributes and allocation descriptors when loading inodes from disk. Otherwise corrupted filesystems could confuse the code and make the kernel oops. Reported-by:
Carl Henrik Lunde <chlunde@ping.uio.no> CC: stable@vger.kernel.org Signed-off-by:
Jan Kara <jack@suse.cz> (cherry picked from commit 23b133bd) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Seth Forshee authored
d1c7e29e (HID: i2c-hid: prevent buffer overflow in early IRQ) changed hid_get_input() to read ihid->bufsize bytes, which can be more than wMaxInputLength. This is the case with the Dell XPS 13 9343, and it is causing events to be missed. In some cases the missed events are releases, which can cause the cursor to jump or freeze, among other problems. Limit the number of bytes read to min(wMaxInputLength, ihid->bufsize) to prevent such problems. Fixes: d1c7e29e "HID: i2c-hid: prevent buffer overflow in early IRQ" Signed-off-by:
Seth Forshee <seth.forshee@canonical.com> Reviewed-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit 6d00f37e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Matej Dubovy authored
Please add support for sub BT chip on the combo card Broadcom 43142A0 (in Lenovo E145), 04ca:2007 /sys/kernel/debug/usb/devices T: Bus=05 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#= 3 Spd=12 MxCh= 0 D: Ver= 2.00 Cls=ff(vend.) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=04ca ProdID=2007 Rev= 1.12 S: Manufacturer=Broadcom Corp S: Product=BCM43142A0 S: SerialNumber=28E347EC73BD C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr= 0mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=01 Prot=01 Driver=(none) E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=(none) E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=(none) E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=(none) E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=(none) E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=(none) E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=(none) E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=84(I) Atr=02(Bulk) MxPS= 32 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS= 32 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 0 Cls=fe(app. ) Sub=01 Prot=01 Driver=(none) Firmware for 04ca:2007 can be extracted from the latest Lenovo E145 Bluetooth driver for Windows (driver is however described as BCM20702 but contains also firwmare for BCM43142). Search for BCM43142A0_001.001.011.0122.0153.hex within hex files, then it must be converted using hex2hcd utility. Rename file to BCM43142A0-04ca-2007.hcd, then move to /lib/firmware/brcm/. Signed-off-by:
Matej Dubovy <matej.dubovy@gmail.com> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org (cherry picked from commit 8f0c304c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Austin Lund authored
Userspace expects to see a long space before the first pulse is sent on the lirc device. Currently, if a long time has passed and a new packet is started, the lirc codec just returns and doesn't send anything. This makes lircd ignore many perfectly valid signals unless they are sent in quick sucession. When a reset event is delivered, we cannot know anything about the duration of the space. But it should be safe to assume it has been a long time and we just set the duration to maximum. Signed-off-by:
Austin Lund <austin.lund@gmail.com> Signed-off-by:
Mauro Carvalho Chehab <mchehab@osg.samsung.com> (cherry picked from commit a8f29e89) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Saran Maruti Ramanara authored
When making use of RFC5061, section 4.2.4. for setting the primary IP address, we're passing a wrong parameter header to param_type2af(), resulting always in NULL being returned. At this point, param.p points to a sctp_addip_param struct, containing a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation id. Followed by that, as also presented in RFC5061 section 4.2.4., comes the actual sctp_addr_param, which also contains a sctp_paramhdr, but this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that param_type2af() can make use of. Since we already hold a pointer to addr_param from previous line, just reuse it for param_type2af(). Fixes: d6de3097 ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT") Signed-off-by:
Saran Maruti Ramanara <saran.neti@telus.com> Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Acked-by:
Vlad Yasevich <vyasevich@gmail.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit cfbf654e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Florian Westphal authored
When we've run out of space in the output buffer to store more data, we will call zlib_deflate with a NULL output buffer until we've consumed remaining input. When this happens, olen contains the size the output buffer would have consumed iff we'd have had enough room. This can later cause skb_over_panic when ppp_generic skb_put()s the returned length. Reported-by:
Iain Douglas <centos@1n6.org.uk> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit e2a4800e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Eric Dumazet authored
In commit be9f4a44 ("ipv4: tcp: remove per net tcp_sock") I tried to address contention on a socket lock, but the solution I chose was horrible : commit 3a7c384f ("ipv4: tcp: unicast_sock should not land outside of TCP stack") addressed a selinux regression. commit 0980e56e ("ipv4: tcp: set unicast_sock uc_ttl to -1") took care of another regression. commit b5ec8eea ("ipv4: fix ip_send_skb()") fixed another regression. commit 811230cd ("tcp: ipv4: initialize unicast_sock sk_pacing_rate") was another shot in the dark. Really, just use a proper socket per cpu, and remove the skb_orphan() call, to re-enable flow control. This solves a serious problem with FQ packet scheduler when used in hostile environments, as we do not want to allocate a flow structure for every RST packet sent in response to a spoofed packet. Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> net: sched: fix panic in rate estimators Doing the following commands on a non idle network device panics the box instantly, because cpu_bstats gets overwritten by stats. tc qdisc add dev eth0 root <your_favorite_qdisc> ... some traffic (one packet is enough) ... tc qdisc replace dev eth0 root est 1sec 4sec <your_favorite_qdisc> [ 325.355596] BUG: unable to handle kernel paging request at ffff8841dc5a074c [ 325.362609] IP: [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90 [ 325.369158] PGD 1fa7067 PUD 0 [ 325.372254] Oops: 0000 [#1] SMP [ 325.375514] Modules linked in: ... [ 325.398346] CPU: 13 PID: 14313 Comm: tc Not tainted 3.19.0-smp-DEV #1163 [ 325.412042] task: ffff8800793ab5d0 ti: ffff881ff2fa4000 task.ti: ffff881ff2fa4000 [ 325.419518] RIP: 0010:[<ffffffff81541c9e>] [<ffffffff81541c9e>] __gnet_stats_copy_basic+0x3e/0x90 [ 325.428506] RSP: 0018:ffff881ff2fa7928 EFLAGS: 00010286 [ 325.433824] RAX: 000000000000000c RBX: ffff881ff2fa796c RCX: 000000000000000c [ 325.440988] RDX: ffff8841dc5a0744 RSI: 0000000000000060 RDI: 0000000000000060 [ 325.448120] RBP: ffff881ff2fa7948 R08: ffffffff81cd4f80 R09: 0000000000000000 [ 325.455268] R10: ffff883ff223e400 R11: 0000000000000000 R12: 000000015cba0744 [ 325.462405] R13: ffffffff81cd4f80 R14: ffff883ff223e460 R15: ffff883feea0722c [ 325.469536] FS: 00007f2ee30fa700(0000) GS:ffff88407fa20000(0000) knlGS:0000000000000000 [ 325.477630] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 325.483380] CR2: ffff8841dc5a074c CR3: 0000003feeae9000 CR4: 00000000001407e0 [ 325.490510] Stack: [ 325.492524] ffff883feea0722c ffff883fef719dc0 ffff883feea0722c ffff883ff223e4a0 [ 325.499990] ffff881ff2fa79a8 ffffffff815424ee ffff883ff223e49c 000000015cba0744 [ 325.507460] 00000000f2fa7978 0000000000000000 ffff881ff2fa79a8 ffff883ff223e4a0 [ 325.514956] Call Trace: [ 325.517412] [<ffffffff815424ee>] gen_new_estimator+0x8e/0x230 [ 325.523250] [<ffffffff815427aa>] gen_replace_estimator+0x4a/0x60 [ 325.529349] [<ffffffff815718ab>] tc_modify_qdisc+0x52b/0x590 [ 325.535117] [<ffffffff8155edd0>] rtnetlink_rcv_msg+0xa0/0x240 [ 325.540963] [<ffffffff8155ed30>] ? __rtnl_unlock+0x20/0x20 [ 325.546532] [<ffffffff8157f811>] netlink_rcv_skb+0xb1/0xc0 [ 325.552145] [<ffffffff8155b355>] rtnetlink_rcv+0x25/0x40 [ 325.557558] [<ffffffff8157f0d8>] netlink_unicast+0x168/0x220 [ 325.563317] [<ffffffff8157f47c>] netlink_sendmsg+0x2ec/0x3e0 Lets play safe and not use an union : percpu 'pointers' are mostly read anyway, and we have typically few qdiscs per host. Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: John Fastabend <john.fastabend@gmail.com> Fixes: 22e0f8b9 ("net: sched: make bstats per cpu and estimator RCU safe") Signed-off-by:
David S. Miller <davem@davemloft.net> hyperv: Fix the error processing in netvsc_send() The existing code frees the skb in EAGAIN case, in which the skb will be retried from upper layer and used again. Also, the existing code doesn't free send buffer slot in error case, because there is no completion message for unsent packets. This patch fixes these problems. (Please also include this patch for stable trees. Thanks!) Signed-off-by:
Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by:
K. Y. Srinivasan <kys@microsoft.com> Signed-off-by:
David S. Miller <davem@davemloft.net> drivers: net: xgene: fix: Out of order descriptor bytes read This patch fixes the following kernel crash, WARNING: CPU: 2 PID: 0 at net/ipv4/tcp_input.c:3079 tcp_clean_rtx_queue+0x658/0x80c() Call trace: [<fffffe0000096b7c>] dump_backtrace+0x0/0x184 [<fffffe0000096d10>] show_stack+0x10/0x1c [<fffffe0000685ea0>] dump_stack+0x74/0x98 [<fffffe00000b44e0>] warn_slowpath_common+0x88/0xb0 [<fffffe00000b461c>] warn_slowpath_null+0x14/0x20 [<fffffe00005b5c1c>] tcp_clean_rtx_queue+0x654/0x80c [<fffffe00005b6228>] tcp_ack+0x454/0x688 [<fffffe00005b6ca8>] tcp_rcv_established+0x4a4/0x62c [<fffffe00005bf4b4>] tcp_v4_do_rcv+0x16c/0x350 [<fffffe00005c225c>] tcp_v4_rcv+0x8e8/0x904 [<fffffe000059d470>] ip_local_deliver_finish+0x100/0x26c [<fffffe000059dad8>] ip_local_deliver+0xac/0xc4 [<fffffe000059d6c4>] ip_rcv_finish+0xe8/0x328 [<fffffe000059dd3c>] ip_rcv+0x24c/0x38c [<fffffe0000563950>] __netif_receive_skb_core+0x29c/0x7c8 [<fffffe0000563ea4>] __netif_receive_skb+0x28/0x7c [<fffffe0000563f54>] netif_receive_skb_internal+0x5c/0xe0 [<fffffe0000564810>] napi_gro_receive+0xb4/0x110 [<fffffe0000482a2c>] xgene_enet_process_ring+0x144/0x338 [<fffffe0000482d18>] xgene_enet_napi+0x1c/0x50 [<fffffe0000565454>] net_rx_action+0x154/0x228 [<fffffe00000b804c>] __do_softirq+0x110/0x28c [<fffffe00000b8424>] irq_exit+0x8c/0xc0 [<fffffe0000093898>] handle_IRQ+0x44/0xa8 [<fffffe000009032c>] gic_handle_irq+0x38/0x7c [...] Software writes poison data into the descriptor bytes[15:8] and upon receiving the interrupt, if those bytes are overwritten by the hardware with the valid data, software also reads bytes[7:0] and executes receive/tx completion logic. If the CPU executes the above two reads in out of order fashion, then the bytes[7:0] will have older data and causing the kernel panic. We have to force the order of the reads and thus this patch introduces read memory barrier between these reads. Signed-off-by:
Iyappan Subramanian <isubramanian@apm.com> Signed-off-by:
Keyur Chudgar <kchudgar@apm.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Merge branch 'vlan_get_protocol' Toshiaki Makita says: ==================== Fix checksum error when using stacked vlan When I was testing 802.1ad, I found several drivers don't take into account 802.1ad or multiple vlans when retrieving L3 (IP/IPv6) or L4 (TCP/UDP) protocol for checksum offload. It is mainly due to vlan_get_protocol(), which extracts ether type only when it is tagged with single 802.1Q. When 802.1ad is used or there are multiple vlans, it extracts vlan protocol and drivers cannot determine which L3/L4 protocol is used. Those drivers, most of which have IP_CSUM/IPV6_CSUM features, get L3/L4 header-offset by software, so it seems that their checksum offload works with multiple vlans if we can parse protocols correctly. (They know mac header length, and probably don't care about what is in it.) And another thing, some of Intel's drivers seem to use skb->protocol where vlan_get_protocol() is more suitable. I tested that at least igb/igbvf on I350 works with this patch set. Note: We can hand a double tagged packet with CHECKSUM_PARTIAL to a HW driver by creating a vlan device on a bridge device and enabling vlan_filtering of the bridge with 802.1ad protocol. ==================== Signed-off-by:
David S. Miller <davem@davemloft.net> ixgbevf: Fix checksum error when using stacked vlan When a skb has multiple vlans and it is CHECKSUM_PARTIAL, ixgbevf_tx_csum() fails to get the network protocol and checksum related descriptor fields are not configured correctly because skb->protocol doesn't show the L3 protocol in this case. Use first->protocol instead of skb->protocol to get the proper network protocol. Signed-off-by:
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by:
David S. Miller <davem@davemloft.net> ixgbe: Fix checksum error when using stacked vlan When a skb has multiple vlans and it is CHECKSUM_PARTIAL, ixgbe_tx_csum() fails to get the network protocol and checksum related descriptor fields are not configured correctly because skb->protocol doesn't show the L3 protocol in this case. Use vlan_get_protocol() to get the proper network protocol. Signed-off-by:
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by:
David S. Miller <davem@davemloft.net> igbvf: Fix checksum error when using stacked vlan When a skb has multiple vlans and it is CHECKSUM_PARTIAL, igbvf_tx_csum() fails to get the network protocol and checksum related descriptor fields are not configured correctly because skb->protocol doesn't show the L3 protocol in this case. Use vlan_get_protocol() to get the proper network protocol. Signed-off-by:
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by:
David S. Miller <davem@davemloft.net> net: Fix vlan_get_protocol for stacked vlan vlan_get_protocol() could not get network protocol if a skb has a 802.1ad vlan tag or multiple vlans, which caused incorrect checksum calculation in several drivers. Fix vlan_get_protocol() to retrieve network protocol instead of incorrect vlan protocol. As the logic is the same as skb_network_protocol(), create a common helper function __vlan_get_protocol() and call it from existing functions. Signed-off-by:
Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by:
David S. Miller <davem@davemloft.net> net: sctp: fix passing wrong parameter header to param_type2af in sctp_process_param When making use of RFC5061, section 4.2.4. for setting the primary IP address, we're passing a wrong parameter header to param_type2af(), resulting always in NULL being returned. At this point, param.p points to a sctp_addip_param struct, containing a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation id. Followed by that, as also presented in RFC5061 section 4.2.4., comes the actual sctp_addr_param, which also contains a sctp_paramhdr, but this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that param_type2af() can make use of. Since we already hold a pointer to addr_param from previous line, just reuse it for param_type2af(). Fixes: d6de3097 ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT") Signed-off-by:
Saran Maruti Ramanara <saran.neti@telus.com> Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Acked-by:
Vlad Yasevich <vyasevich@gmail.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
David S. Miller <davem@davemloft.net> netlink: fix wrong subscription bitmask to group mapping in The subscription bitmask passed via struct sockaddr_nl is converted to the group number when calling the netlink_bind() and netlink_unbind() callbacks. The conversion is however incorrect since bitmask (1 << 0) needs to be mapped to group number 1. Note that you cannot specify the group number 0 (usually known as _NONE) from setsockopt() using NETLINK_ADD_MEMBERSHIP since this is rejected through -EINVAL. This problem became noticeable since 97840cb6 ("netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind") when binding to bitmask (1 << 0) in ctnetlink. Reported-by:
Andre Tomt <andre@tomt.net> Reported-by:
Ivan Delalande <colona@arista.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
David S. Miller <davem@davemloft.net> ipv4: Don't increase PMTU with Datagram Too Big message. RFC 1191 said, "a host MUST not increase its estimate of the Path MTU in response to the contents of a Datagram Too Big message." Signed-off-by:
Li Wei <lw@cn.fujitsu.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Merge branch 'arm-build-fixes' Arnd Bergmann says: ==================== net: driver fixes from arm randconfig builds These four patches are fallout from test builds on ARM. I have a few more of them in my backlog but have not yet confirmed them to still be valid. The first three patches are about incomplete dependencies on old drivers. One could backport them to the beginning of time in theory, but there is little value since nobody would run into these problems. The final patch is one I had submitted before together with the respective pcmcia patch but forgot to follow up on that. It's still a valid but relatively theoretical bug, because the previous behavior of the driver was just as broken as what we have in mainline. ==================== Signed-off-by:
David S. Miller <davem@davemloft.net> net: am2150: fix nmclan_cs.c shared interrupt handling A recent patch tried to work around a valid warning for the use of a deprecated interface by blindly changing from the old pcmcia_request_exclusive_irq() interface to pcmcia_request_irq(). This driver has an interrupt handler that is not currently aware of shared interrupts, but can be easily converted to be. At the moment, the driver reads the interrupt status register repeatedly until it contains only zeroes in the interesting bits, and handles each bit individually. This patch adds the missing part of returning IRQ_NONE in case none of the bits are set to start with, so we can move on to the next interrupt source. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Fixes: 5f5316fc ("am2150: Update nmclan_cs.c to use update PCMCIA API") Signed-off-by:
David S. Miller <davem@davemloft.net> net: lance,ni64: don't build for ARM The ni65 and lance ethernet drivers manually program the ISA DMA controller that is only available on x86 PCs and a few compatible systems. Trying to build it on ARM results in this error: ni65.c: In function 'ni65_probe1': ni65.c:496:62: error: 'DMA1_STAT_REG' undeclared (first use in this function) ((inb(DMA1_STAT_REG) >> 4) & 0x0f) ^ ni65.c:496:62: note: each undeclared identifier is reported only once for each function it appears in ni65.c:497:63: error: 'DMA2_STAT_REG' undeclared (first use in this function) | (inb(DMA2_STAT_REG) & 0xf0); The DMA1_STAT_REG and DMA2_STAT_REG registers are only defined for alpha, mips, parisc, powerpc and x86, although it is not clear which subarchitectures actually have them at the correct location. This patch for now just disables it for ARM, to avoid randconfig build errors. We could also decide to limit it to the set of architectures on which it does compile, but that might look more deliberate than guessing based on where the drivers build. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
David S. Miller <davem@davemloft.net> net: wan: add missing virt_to_bus dependencies The cosa driver is rather outdated and does not get built on most platforms because it requires the ISA_DMA_API symbol. However there are some ARM platforms that have ISA_DMA_API but no virt_to_bus, and they get this build error when enabling the ltpc driver. drivers/net/wan/cosa.c: In function 'tx_interrupt': drivers/net/wan/cosa.c:1768:3: error: implicit declaration of function 'virt_to_bus' unsigned long addr = virt_to_bus(cosa->txbuf); ^ The same problem exists for the Hostess SV-11 and Sealevel Systems 4021 drivers. This adds another dependency in Kconfig to avoid that configuration. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
David S. Miller <davem@davemloft.net> net: cs89x0: always build platform code if !HAS_IOPORT_MAP The cs89x0 driver can either be built as an ISA driver or a platform driver, the choice is controlled by the CS89x0_PLATFORM Kconfig symbol. Building the ISA driver on a system that does not have a way to map I/O ports fails with this error: drivers/built-in.o: In function `cs89x0_ioport_probe.constprop.1': :(.init.text+0x4794): undefined reference to `ioport_map' :(.init.text+0x4830): undefined reference to `ioport_unmap' This changes the Kconfig logic to take that option away and always force building the platform variant of this driver if CONFIG_HAS_IOPORT_MAP is not set. This is the only correct choice in this case, and it avoids the build error. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
David S. Miller <davem@davemloft.net> ppp: deflate: never return len larger than output buffer When we've run out of space in the output buffer to store more data, we will call zlib_deflate with a NULL output buffer until we've consumed remaining input. When this happens, olen contains the size the output buffer would have consumed iff we'd have had enough room. This can later cause skb_over_panic when ppp_generic skb_put()s the returned length. Reported-by:
Iain Douglas <centos@1n6.org.uk> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
David S. Miller <davem@davemloft.net> Merge branch 'netns' Nicolas Dichtel says: ==================== netns: audit netdevice creation with IFLA_NET_NS_[PID|FD] When one of these attributes is set, the netdevice is created into the netns pointed by IFLA_NET_NS_[PID|FD] (see the call to rtnl_create_link() in rtnl_newlink()). Let's call this netns the dest_net. After this creation, if the newlink handler exists, it is called with a netns argument that points to the netns where the netlink message has been received (called src_net in the code) which is the link netns. Hence, with one of these attributes, it's possible to create a x-netns netdevice. Here is the result of my code review: - all ip tunnels (sit, ipip, ip6_tunnels, gre[tap][v6], ip_vti[6]) does not really allows to use this feature: the netdevice is created in the dest_net and the src_net is completely ignored in the newlink handler. - VLAN properly handles this x-netns creation. - bridge ignores src_net, which seems fine (NETIF_F_NETNS_LOCAL is set). - CAIF subsystem is not clear for me (I don't know how it works), but it seems to wrongly use src_net. Patch #1 tries to fix this, but it was done only by code review (and only compile-tested), so please carefully review it. I may miss something. - HSR subsystem uses src_net to parse IFLA_HSR_SLAVE[1|2], but the netdevice has the flag NETIF_F_NETNS_LOCAL, so the question is: does this netdevice really supports x-netns? If not, the newlink handler should use the dest_net instead of src_net, I can provide the patch. - ieee802154 uses also src_net and does not have NETIF_F_NETNS_LOCAL. Same question: does this netdevice really supports x-netns? - bonding ignores src_net and flag NETIF_F_NETNS_LOCAL is set, ie x-netns is not supported. Fine. - CAN does not support rtnl/newlink, ok. - ipvlan uses src_net and does not have NETIF_F_NETNS_LOCAL. After looking at the code, it seems that this drivers support x-netns. Am I right? - macvlan/macvtap uses src_net and seems to have x-netns support. - team ignores src_net and has the flag NETIF_F_NETNS_LOCAL, ie x-netns is not supported. Ok. - veth uses src_net and have x-netns support ;-) Ok. - VXLAN didn't properly handle this. The link netns (vxlan->net) is the src_net and not dest_net (see patch #2). Note that it was already possible to create a x-netns vxlan before the commit f01ec1c0 ("vxlan: add x-netns support") but the nedevice remains broken. To summarize: - CAIF patch must be carefully reviewed - for HSR, ieee802154, ipvlan: is x-netns supported? ==================== Signed-off-by:
David S. Miller <davem@davemloft.net> vxlan: setup the right link netns in newlink hdlr Rename the netns to src_net to avoid confusion with the netns where the interface stands. The user may specify IFLA_NET_NS_[PID|FD] to create a x-netns netndevice: IFLA_NET_NS_[PID|FD] points to the netns where the netdevice stands and src_net to the link netns. Note that before commit f01ec1c0 ("vxlan: add x-netns support"), it was possible to create a x-netns vxlan netdevice, but the netdevice was not operational. Fixes: f01ec1c0 ("vxlan: add x-netns support") Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
David S. Miller <davem@davemloft.net> caif: remove wrong dev_net_set() call src_net points to the netns where the netlink message has been received. This netns may be different from the netns where the interface is created (because the user may add IFLA_NET_NS_[PID|FD]). In this case, src_net is the link netns. It seems wrong to override the netns in the newlink() handler because if it was not already src_net, it means that the user explicitly asks to create the netdevice in another netns. CC: Sjur Brændeland <sjur.brandeland@stericsson.com> CC: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no> Fixes: 8391c4aa ("caif: Bugfixes in CAIF netdevice for close and flow control") Fixes: c4125400 ("caif-hsi: Add rtnl support") Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
David S. Miller <davem@davemloft.net> lib/checksum.c: fix build for generic csum_tcpudp_nofold Fixed commit added from64to32 under _#ifndef do_csum_ but used it under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's robot reported TILEGX's). Move from64to32 under the latter. Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold") Reported-by:
kbuild test robot <fengguang.wu@intel.com> Signed-off-by:
Karl Beldan <karl.beldan@rivierawaves.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
David S. Miller <davem@davemloft.net> tcp: ipv4: initialize unicast_sock sk_pacing_rate When I added sk_pacing_rate field, I forgot to initialize its value in the per cpu unicast_sock used in ip_send_unicast_reply() This means that for sch_fq users, RST packets, or ACK packets sent on behalf of TIME_WAIT sockets might be sent to slowly or even dropped once we reach the per flow limit. Signed-off-by:
Eric Dumazet <edumazet@google.com> Fixes: 95bd09eb ("tcp: TSO packets automatic sizing") Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit bdbbb852 811230cd) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Roopa Prabhu authored
Reported in: https://bugzilla.kernel.org/show_bug.cgi?id=92081 This patch avoids calling rtnl_notify if the device ndo_bridge_getlink handler does not return any bytes in the skb. Alternately, the skb->len check can be moved inside rtnl_notify. For the bridge vlan case described in 92081, there is also a fix needed in bridge driver to generate a proper notification. Will fix that in subsequent patch. v2: rebase patch on net tree Signed-off-by:
Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 59ccaaaa) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
subashab@codeaurora.org authored
An exception is seen in ICMP ping receive path where the skb destructor sock_rfree() tries to access a freed socket. This happens because ping_rcv() releases socket reference with sock_put() and this internally frees up the socket. Later icmp_rcv() will try to free the skb and as part of this, skb destructor is called and which leads to a kernel panic as the socket is freed already in ping_rcv(). -->|exception -007|sk_mem_uncharge -007|sock_rfree -008|skb_release_head_state -009|skb_release_all -009|__kfree_skb -010|kfree_skb -011|icmp_rcv -012|ip_local_deliver_finish Fix this incorrect free by cloning this skb and processing this cloned skb instead. This patch was suggested by Eric Dumazet Signed-off-by:
Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit fc752f1f) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Herbert Xu authored
While working on rhashtable walking I noticed that the UDP diag dumping code is buggy. In particular, the socket skipping within a chain never happens, even though we record the number of sockets that should be skipped. As this code was supposedly copied from TCP, this patch does what TCP does and resets num before we walk a chain. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Acked-by:
Pavel Emelyanov <xemul@parallels.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 86f3cddb) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-