- 10 Apr, 2018 21 commits
-
-
Harald Freudenberger authored
This patch removes the old status calls which have been marked as deprecated since at least 2 years now. There is no known application or library relying on these ioctls any more. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Harald Freudenberger authored
The ap init functions ap_module_init and ap_debug_init are only used within ap_bus.c. Make these functions static and do not declare them in any header file any more. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
The linux.vnet.ibm.com domain will be discontinued end of 2018. Instead the new linux.ibm.com domain is already active. Reflect this by changing the email addresses of maintainers active in the s390 area accordingly. Acked-by: Benjamin Block <bblock@linux.ibm.com> Acked-by: Dong Jia Shi <bjsdjshi@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Jan Hoeppner <hoeppner@linux.ibm.com> Acked-by: Janosch Frank <frankja@linux.ibm.com> Acked-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com> Acked-by: Sebastian Ott <sebott@linux.ibm.com> Acked-by: Stefan Haberland <sth@linux.ibm.com> Acked-by: Steffen Maier <maier@linux.ibm.com> Acked-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
reipl_method and dump_method have been used in addition to reipl_type and dump_type, because a single reipl_type could be achieved with multiple reipl_method (same for dump_type/method). After dropping non-diag308_set based reipl methods, there is a single method per reipl_type/dump_type and reipl_method and dump_method could be simply removed. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
s390 kdump reipl implementation relies on os_info kernel structure residing in old memory being dumped. os_info contains reipl block, which is used (if valid) by the kdump kernel for reipl parameters. The problem is that the reipl block and its checksum inside os_info is updated only when /sys/firmware/reipl/reipl_type is written. This sets an offset of a reipl block for "reipl_type" and re-calculates reipl block checksum. Any further alteration of values under /sys/firmware/reipl/{reipl_type}/ without subsequent write to /sys/firmware/reipl/reipl_type lead to incorrect os_info reipl block checksum. In such a case kdump kernel ignores it and reboots using default logic. To fix this, os_info reipl block update is moved right before kdump execution. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
do_reipl, do_halt and do_poff are not defined anywhere. Cleaning up functions declaration. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
diag308 set has been available for many machine generations, and alternative reipl code paths has not been exercised and seems to be broken without noticing for a while now. So, cleaning up all obsolete reipl methods except currently used ones, assuming that diag308 set always works. Also removing not longer needed reset callbacks. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Add missing ipl parmblock validity check to append_ipl_scpdata. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
In some cases diag308_set_works used to be misused as "we have valid ipl parmblock", which is not the case when diag308 set works, but there is no ipl parmblock (diag308 store returns DIAG308_RC_NOCONFIG). Such checks are adjusted to reuse ipl_block_valid instead of diag308_set_works. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
For both ccw and fcp boot retrieve ipl info from ipl block received via diag308 store. Old scsi ipl parm block handling and cio_get_iplinfo are removed. Ipl type is deducted from ipl block (if valid). Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
ipl_flags and corresponding enum are not used outside of ipl.c and will be reworked in later commits. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
ipl_ssid and ipl_devno used to be used during ccw boot when diag308 store was not available. Reuse ipl_block to store those values. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
Ipl parm blocks received via "diag308 store" and during scsi boot at IPL_PARMBLOCK_ORIGIN are merged into the "ipl_block". Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Vasily Gorbik authored
When loadparm is set in reipl parm block, the kernel should also set DIAG308_FLAGS_LP_VALID flag. This fixes loadparm ignoring during z/VM fcp -> ccw reipl and kvm direct boot -> ccw reipl. Cc: <stable@vger.kernel.org> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Julian Wiedmann authored
During setup, qdio takes control of the presented ccw device and replaces the device's IRQ handler with its own. To avoid any interference with conccurent activity on the device, this should be done while holding the device's lock. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Julian Wiedmann authored
During shutdown, qdio returns its ccw device back to control by the upper-layer driver. But there is a remote chance that by the time where the IRQ handler gets switched back, the interrupt for the preceding ccw_device_{clear,halt} hasn't been presented yet. Upper-layer drivers would then need to handle this IRQ - and since the IO is issued with an intparm, it could very well be confused with whatever intparm mechanism the driver uses itself (eg intparm == request address). So when switching over the IRQ handler, also clear the intparm and have upper-layer drivers deal with any such delayed interrupt as if it was unsolicited. Suggested-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Julian Wiedmann authored
ccwgroup_create_dev() derives the gdev's device name from gdev->cdev[0], so make sure that this reference is valid. For robustness only, all current ccwgroup drivers get this right. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Harald Freudenberger authored
The AP bus code is not available as kernel module any more. There was some leftover code dealing with kernel module exit which has been removed with this patch. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
Git commit c60a03fe ("s390: switch to {get,put}_compat_sigset()") contains a typo and now copies the wrong pointer to user space. Use the correct pointer instead. Reported-and-tested-by: Stefan Liebler <stli@linux.vnet.ibm.com> Fixes: c60a03fe ("s390: switch to {get,put}_compat_sigset()") Cc: <stable@vger.kernel.org> # v4.15+ Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Harald Freudenberger authored
Tests with paes-xts and debugging investigations showed that the ciphers are not always correctly resolved. The rules for cipher priorities seem to be: - Ecb-aes should have a prio greater than the generic ecb-aes. - The mode specialized ciphers (like cbc-aes-s390) should have a prio greater than the sum of the more generic combinations (like cbs(aes)). This patch adjusts the cipher priorities for the s390 aes and paes in kernel crypto implementations. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) The sockmap code has to free socket memory on close if there is corked data, from John Fastabend. 2) Tunnel names coming from userspace need to be length validated. From Eric Dumazet. 3) arp_filter() has to take VRFs properly into account, from Miguel Fadon Perlines. 4) Fix oops in error path of tcf_bpf_init(), from Davide Caratti. 5) Missing idr_remove() in u32_delete_key(), from Cong Wang. 6) More syzbot stuff. Several use of uninitialized value fixes all over, from Eric Dumazet. 7) Do not leak kernel memory to userspace in sctp, also from Eric Dumazet. 8) Discard frames from unused ports in DSA, from Andrew Lunn. 9) Fix DMA mapping and reset/failover problems in ibmvnic, from Thomas Falcon. 10) Do not access dp83640 PHY registers prematurely after reset, from Esben Haabendal. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) vhost-net: set packet weight of tx polling to 2 * vq size net: thunderx: rework mac addresses list to u64 array inetpeer: fix uninit-value in inet_getpeer dp83640: Ensure against premature access to PHY registers after reset devlink: convert occ_get op to separate registration ARM: dts: ls1021a: Specify TBIPA register address net/fsl_pq_mdio: Allow explicit speficition of TBIPA address ibmvnic: Do not reset CRQ for Mobility driver resets ibmvnic: Fix failover case for non-redundant configuration ibmvnic: Fix reset scheduler error handling ibmvnic: Zero used TX descriptor counter on reset ibmvnic: Fix DMA mapping mistakes tipc: use the right skb in tipc_sk_fill_sock_diag() sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6 net: dsa: Discard frames from unused ports sctp: do not leak kernel memory to user space soreuseport: initialise timewait reuseport field ipv4: fix uninit-value in ip_route_output_key_hash_rcu() dccp: initialize ireq->ir_mark net: fix uninit-value in __hw_addr_add_ex() ...
-
- 09 Apr, 2018 15 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs namei updates from Al Viro: - make lookup_one_len() safe with parent locked only shared(incoming afs series wants that) - fix of getname_kernel() regression from 2015 (-stable fodder, that one). * 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: getname_kernel() needs to make sure that ->name != ->iname in long case make lookup_one_len() safe to use with directory locked shared new helper: __lookup_slow() merge common parts of lookup_one_len{,_unlocked} into common helper
-
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linuxLinus Torvalds authored
Pull orangefs updates from Mike Marshall: "Fixes and cleanups: - Documentation cleanups - removal of unused code - make some structs static - implement Orangefs vm_operations fault callout - eliminate two single-use functions and put their cleaned up code in line. - replace a vmalloc/memset instance with vzalloc - fix a race condition bug in wait code" * tag 'for-linus-4.17-ofs' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: Orangefs: documentation updates orangefs: document package install and xfstests procedure orangefs: remove unused code orangefs: make several *_operations structs static orangefs: implement vm_ops->fault orangefs: open code short single-use functions orangefs: replace vmalloc and memset with vzalloc orangefs: bug fix for a race condition when getting a slot
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds authored
Pull pstore fix from Kees Cook: "Fix another compression Kconfig combination missed in testing (Tobias Regnery)" * tag 'pstore-v4.17-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore: fix crypto dependencies without compression
-
Stephen Smalley authored
Commit 0619f0f5 ("selinux: wrap selinuxfs state") triggers a BUG when SELinux is runtime-disabled (i.e. systemd or equivalent disables SELinux before initial policy load via /sys/fs/selinux/disable based on /etc/selinux/config SELINUX=disabled). This does not manifest if SELinux is disabled via kernel command line argument or if SELinux is enabled (permissive or enforcing). Before: SELinux: Disabled at runtime. BUG: Dentry 000000006d77e5c7{i=17,n=null} still in use (1) [unmount of selinuxfs selinuxfs] After: SELinux: Disabled at runtime. Fixes: 0619f0f5 ("selinux: wrap selinuxfs state") Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull kvm updates from Paolo Bonzini: "ARM: - VHE optimizations - EL2 address space randomization - speculative execution mitigations ("variant 3a", aka execution past invalid privilege register access) - bugfixes and cleanups PPC: - improvements for the radix page fault handler for HV KVM on POWER9 s390: - more kvm stat counters - virtio gpu plumbing - documentation - facilities improvements x86: - support for VMware magic I/O port and pseudo-PMCs - AMD pause loop exiting - support for AMD core performance extensions - support for synchronous register access - expose nVMX capabilities to userspace - support for Hyper-V signaling via eventfd - use Enlightened VMCS when running on Hyper-V - allow userspace to disable MWAIT/HLT/PAUSE vmexits - usual roundup of optimizations and nested virtualization bugfixes Generic: - API selftest infrastructure (though the only tests are for x86 as of now)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (174 commits) kvm: x86: fix a prototype warning kvm: selftests: add sync_regs_test kvm: selftests: add API testing infrastructure kvm: x86: fix a compile warning KVM: X86: Add Force Emulation Prefix for "emulate the next instruction" KVM: X86: Introduce handle_ud() KVM: vmx: unify adjacent #ifdefs x86: kvm: hide the unused 'cpu' variable KVM: VMX: remove bogus WARN_ON in handle_ept_misconfig Revert "KVM: X86: Fix SMRAM accessing even if VM is shutdown" kvm: Add emulation for movups/movupd KVM: VMX: raise internal error for exception during invalid protected mode state KVM: nVMX: Optimization: Dont set KVM_REQ_EVENT when VMExit with nested_run_pending KVM: nVMX: Require immediate-exit when event reinjected to L2 and L1 event pending KVM: x86: Fix misleading comments on handling pending exceptions KVM: x86: Rename interrupt.pending to interrupt.injected KVM: VMX: No need to clear pending NMI/interrupt on inject realmode interrupt x86/kvm: use Enlightened VMCS when running on Hyper-V x86/hyper-v: detect nested features x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits ...
-
Linus Torvalds authored
Commit 3c8ba0d6 ("kernel.h: Retain constant expression output for max()/min()") rewrote our min/max macros to be very clever, but in the meantime resurrected a variable name shadow issue that we had had previously fixed in commit 589a9785 ("min/max: remove sparse warnings when they're nested"). That commit talks about the sparse warnings that this shadowing causes, which we ignored as just a minor annoyance. But it turns out that the sparse warning is the least of our problems. We actually have a real bug due to the shadowing through the interaction with "min_not_zero()", which ends up doing min(__x, __y) internally, and then the new declaration of "__x" and "__y" as new variables in __cmp_once() results in a complete mess of an expression, and "min_not_zero()" doesn't work at all. For some odd reason, this only ever caused (reported) problems on s390, even though it is a generic issue and most of the (obviously successful) testing of the problematic commit had happened on other architectures. Quoting Sebastian Ott: "What happened is that the bio build by the partition detection code was attempted to be split by the block layer because the block queue had a max_sector setting of 0. blk_queue_max_hw_sectors uses min_not_zero." So re-introduce the use of __UNIQUE_ID() to make sure that the min/max macros do not have these kinds of clashes. [ That said, __UNIQUE_ID() itself has several issues that make it less than wonderful. In particular, the "uniqueness" has a fallback on the line number, which means that it's not actually unique in more complex cases if you don't build with gcc or clang (which have working unique counters that aren't tied to line numbers). That historical broken fallback also means that we have that pointless "prefix" argument that doesn't actually make much sense _except_ for the known-broken case. Oh well. ] Fixes: 3c8ba0d6 ("kernel.h: Retain constant expression output for max()/min()") Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds authored
Pull ARM SA1100 updates from Russell King: "We have support for arbitary MMIO registers providing platform GPIOs, which allows us to abstract some of the SA11x0 CF support. This set of updates makes that change" * 'for-linus-sa1100' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: sa1100/simpad: switch simpad CF to use gpiod APIs ARM: sa1100/shannon: convert to generic CF sockets ARM: sa1100/nanoengine: convert to generic CF sockets ARM: sa1100/h3xxx: switch h3xxx PCMCIA to use gpiod APIs ARM: sa1100/cerf: convert to generic CF sockets ARM: sa1100/assabet: convert to generic CF sockets ARM: sa1100: provide infrastructure to support generic CF sockets pcmcia: sa1100: provide generic CF support
-
git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds authored
Pull ARM updates from Russell King: "A number of core ARM changes: - Refactoring linker script by Nicolas Pitre - Enable source fortification - Add support for Cortex R8" * 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: decompressor: fix warning introduced in fortify patch ARM: 8751/1: Add support for Cortex-R8 processor ARM: 8749/1: Kconfig: Add ARCH_HAS_FORTIFY_SOURCE ARM: simplify and fix linker script for TCM ARM: linker script: factor out TCM bits ARM: linker script: factor out vectors and stubs ARM: linker script: factor out unwinding table sections ARM: linker script: factor out stuff for the .text section ARM: linker script: factor out stuff for the DISCARD section ARM: linker script: factor out some common definitions between XIP and non-XIP
-
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommuLinus Torvalds authored
Pull m68knommu update from Greg Ungerer: "Only a single fix to set the DMA masks in the ColdFire FEC platform data structure. This stops the warning from dma-mapping.h at boot time" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: set dma and coherent masks for platform FEC ethernets
-
git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alphaLinus Torvalds authored
Pull alpha updates from Matt Turner: "A few small changes for alpha" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha: alpha: io: reorder barriers to guarantee writeX() and iowriteX() ordering alpha: Implement CPU vulnerabilities sysfs functions. alpha: rtc: stop validating rtc_time in .read_time alpha: rtc: remove unused set_mmss ops
-
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds authored
Pull s390 updates from Martin Schwidefsky: - Improvements for the spectre defense: * The spectre related code is consolidated to a single file nospec-branch.c * Automatic enable/disable for the spectre v2 defenses (expoline vs. nobp) * Syslog messages for specve v2 are added * Enable CONFIG_GENERIC_CPU_VULNERABILITIES and define the attribute functions for spectre v1 and v2 - Add helper macros for assembler alternatives and use them to shorten the code in entry.S. - Add support for persistent configuration data via the SCLP Store Data interface. The H/W interface requires a page table that uses 4K pages only, the code to setup such an address space is added as well. - Enable virtio GPU emulation in QEMU. To do this the depends statements for a few common Kconfig options are modified. - Add support for format-3 channel path descriptors and add a binary sysfs interface to export the associated utility strings. - Add a sysfs attribute to control the IFCC handling in case of constant channel errors. - The vfio-ccw changes from Cornelia. - Bug fixes and cleanups. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (40 commits) s390/kvm: improve stack frame constants in entry.S s390/lpp: use assembler alternatives for the LPP instruction s390/entry.S: use assembler alternatives s390: add assembler macros for CPU alternatives s390: add sysfs attributes for spectre s390: report spectre mitigation via syslog s390: add automatic detection of the spectre defense s390: move nobp parameter functions to nospec-branch.c s390/cio: add util_string sysfs attribute s390/chsc: query utility strings via fmt3 channel path descriptor s390/cio: rename struct channel_path_desc s390/cio: fix unbind of io_subchannel_driver s390/qdio: split up CCQ handling for EQBS / SQBS s390/qdio: don't retry EQBS after CCQ 96 s390/qdio: restrict buffer merging to eligible devices s390/qdio: don't merge ERROR output buffers s390/qdio: simplify math in get_*_buffer_frontier() s390/decompressor: trim uncompressed image head during the build s390/crypto: Fix kernel crash on aes_s390 module remove. s390/defkeymap: fix global init to zero ...
-
haibinzhang(张海斌) authored
handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy polling udp packets with small length(e.g. 1byte udp payload), because setting VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet length. Ping-Latencies shown below were tested between two Virtual Machines using netperf (UDP_STREAM, len=1), and then another machine pinged the client: vq size=256 Packet-Weight Ping-Latencies(millisecond) min avg max Origin 3.319 18.489 57.303 64 1.643 2.021 2.552 128 1.825 2.600 3.224 256 1.997 2.710 4.295 512 1.860 3.171 4.631 1024 2.002 4.173 9.056 2048 2.257 5.650 9.688 4096 2.093 8.508 15.943 vq size=512 Packet-Weight Ping-Latencies(millisecond) min avg max Origin 6.537 29.177 66.245 64 2.798 3.614 4.403 128 2.861 3.820 4.775 256 3.008 4.018 4.807 512 3.254 4.523 5.824 1024 3.079 5.335 7.747 2048 3.944 8.201 12.762 4096 4.158 11.057 19.985 Seems pretty consistent, a small dip at 2 VQ sizes. Ring size is a hint from device about a burst size it can tolerate. Based on benchmarks, set the weight to 2 * vq size. To evaluate this change, another tests were done using netperf(RR, TX) between two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was tweaked through qemu. Results shown below does not show obvious changes. vq size=256 TCP_RR vq size=512 TCP_RR size/sessions/+thu%/+normalize% size/sessions/+thu%/+normalize% 1/ 1/ -7%/ -2% 1/ 1/ 0%/ -2% 1/ 4/ +1%/ 0% 1/ 4/ +1%/ 0% 1/ 8/ +1%/ -2% 1/ 8/ 0%/ +1% 64/ 1/ -6%/ 0% 64/ 1/ +7%/ +3% 64/ 4/ 0%/ +2% 64/ 4/ -1%/ +1% 64/ 8/ 0%/ 0% 64/ 8/ -1%/ -2% 256/ 1/ -3%/ -4% 256/ 1/ -4%/ -2% 256/ 4/ +3%/ +4% 256/ 4/ +1%/ +2% 256/ 8/ +2%/ 0% 256/ 8/ +1%/ -1% vq size=256 UDP_RR vq size=512 UDP_RR size/sessions/+thu%/+normalize% size/sessions/+thu%/+normalize% 1/ 1/ -5%/ +1% 1/ 1/ -3%/ -2% 1/ 4/ +4%/ +1% 1/ 4/ -2%/ +2% 1/ 8/ -1%/ -1% 1/ 8/ -1%/ 0% 64/ 1/ -2%/ -3% 64/ 1/ +1%/ +1% 64/ 4/ -5%/ -1% 64/ 4/ +2%/ 0% 64/ 8/ 0%/ -1% 64/ 8/ -2%/ +1% 256/ 1/ +7%/ +1% 256/ 1/ -7%/ 0% 256/ 4/ +1%/ +1% 256/ 4/ -3%/ -4% 256/ 8/ +2%/ +2% 256/ 8/ +1%/ +1% vq size=256 TCP_STREAM vq size=512 TCP_STREAM size/sessions/+thu%/+normalize% size/sessions/+thu%/+normalize% 64/ 1/ 0%/ -3% 64/ 1/ 0%/ 0% 64/ 4/ +3%/ -1% 64/ 4/ -2%/ +4% 64/ 8/ +9%/ -4% 64/ 8/ -1%/ +2% 256/ 1/ +1%/ -4% 256/ 1/ +1%/ +1% 256/ 4/ -1%/ -1% 256/ 4/ -3%/ 0% 256/ 8/ +7%/ +5% 256/ 8/ -3%/ 0% 512/ 1/ +1%/ 0% 512/ 1/ -1%/ -1% 512/ 4/ +1%/ -1% 512/ 4/ 0%/ 0% 512/ 8/ +7%/ -5% 512/ 8/ +6%/ -1% 1024/ 1/ 0%/ -1% 1024/ 1/ 0%/ +1% 1024/ 4/ +3%/ 0% 1024/ 4/ +1%/ 0% 1024/ 8/ +8%/ +5% 1024/ 8/ -1%/ 0% 2048/ 1/ +2%/ +2% 2048/ 1/ -1%/ 0% 2048/ 4/ +1%/ 0% 2048/ 4/ 0%/ -1% 2048/ 8/ -2%/ 0% 2048/ 8/ 5%/ -1% 4096/ 1/ -2%/ 0% 4096/ 1/ -2%/ 0% 4096/ 4/ +2%/ 0% 4096/ 4/ 0%/ 0% 4096/ 8/ +9%/ -2% 4096/ 8/ -5%/ -1% Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Haibin Zhang <haibinzhang@tencent.com> Signed-off-by: Yunfang Tai <yunfangtai@tencent.com> Signed-off-by: Lidong Chen <lidongchen@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vadim Lomovtsev authored
It is too expensive to pass u64 values via linked list, instead allocate array for them by overall number of mac addresses from netdev. This eventually removes multiple kmalloc() calls, aviod memory fragmentation and allow to put single null check on kmalloc return value in order to prevent a potential null pointer dereference. Addresses-Coverity-ID: 1467429 ("Dereference null return value") Fixes: 37c3347e ("net: thunderx: add ndo_set_rx_mode callback implementation for VF") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
syzbot/KMSAN reported that p->dtime was read while it was not yet initialized in : delta = (__u32)jiffies - p->dtime; if (delta < ttl || !refcount_dec_if_one(&p->refcnt)) gc_stack[i] = NULL; This is a false positive, because the inetpeer wont be erased from rb-tree if the refcount_dec_if_one(&p->refcnt) does not succeed. And this wont happen before first inet_putpeer() call for this inetpeer has been done, and ->dtime field is written exactly before the refcount_dec_and_test(&p->refcnt). The KMSAN report was : BUG: KMSAN: uninit-value in inet_peer_gc net/ipv4/inetpeer.c:163 [inline] BUG: KMSAN: uninit-value in inet_getpeer+0x1567/0x1e70 net/ipv4/inetpeer.c:228 CPU: 0 PID: 9494 Comm: syz-executor5 Not tainted 4.16.0+ #82 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 inet_peer_gc net/ipv4/inetpeer.c:163 [inline] inet_getpeer+0x1567/0x1e70 net/ipv4/inetpeer.c:228 inet_getpeer_v4 include/net/inetpeer.h:110 [inline] icmpv4_xrlim_allow net/ipv4/icmp.c:330 [inline] icmp_send+0x2b44/0x3050 net/ipv4/icmp.c:725 ip_options_compile+0x237c/0x29f0 net/ipv4/ip_options.c:472 ip_rcv_options net/ipv4/ip_input.c:284 [inline] ip_rcv_finish+0xda8/0x16d0 net/ipv4/ip_input.c:365 NF_HOOK include/linux/netfilter.h:288 [inline] ip_rcv+0x119d/0x16f0 net/ipv4/ip_input.c:493 __netif_receive_skb_core+0x47cf/0x4a80 net/core/dev.c:4562 __netif_receive_skb net/core/dev.c:4627 [inline] netif_receive_skb_internal+0x49d/0x630 net/core/dev.c:4701 netif_receive_skb+0x230/0x240 net/core/dev.c:4725 tun_rx_batched drivers/net/tun.c:1555 [inline] tun_get_user+0x6d88/0x7580 drivers/net/tun.c:1962 tun_chr_write_iter+0x1d4/0x330 drivers/net/tun.c:1990 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776 do_iter_write+0x30d/0xd40 fs/read_write.c:932 vfs_writev fs/read_write.c:977 [inline] do_writev+0x3c9/0x830 fs/read_write.c:1012 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085 SyS_writev+0x56/0x80 fs/read_write.c:1082 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x455111 RSP: 002b:00007fae0365cba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 000000000000002e RCX: 0000000000455111 RDX: 0000000000000001 RSI: 00007fae0365cbf0 RDI: 00000000000000fc RBP: 0000000020000040 R08: 00000000000000fc R09: 0000000000000000 R10: 000000000000002e R11: 0000000000000293 R12: 00000000ffffffff R13: 0000000000000658 R14: 00000000006fc8e0 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmem_cache_alloc+0xaab/0xb90 mm/slub.c:2756 inet_getpeer+0xed8/0x1e70 net/ipv4/inetpeer.c:210 inet_getpeer_v4 include/net/inetpeer.h:110 [inline] ip4_frag_init+0x4d1/0x740 net/ipv4/ip_fragment.c:153 inet_frag_alloc net/ipv4/inet_fragment.c:369 [inline] inet_frag_create net/ipv4/inet_fragment.c:385 [inline] inet_frag_find+0x7da/0x1610 net/ipv4/inet_fragment.c:418 ip_find net/ipv4/ip_fragment.c:275 [inline] ip_defrag+0x448/0x67a0 net/ipv4/ip_fragment.c:676 ip_check_defrag+0x775/0xda0 net/ipv4/ip_fragment.c:724 packet_rcv_fanout+0x2a8/0x8d0 net/packet/af_packet.c:1447 deliver_skb net/core/dev.c:1897 [inline] deliver_ptype_list_skb net/core/dev.c:1912 [inline] __netif_receive_skb_core+0x314a/0x4a80 net/core/dev.c:4545 __netif_receive_skb net/core/dev.c:4627 [inline] netif_receive_skb_internal+0x49d/0x630 net/core/dev.c:4701 netif_receive_skb+0x230/0x240 net/core/dev.c:4725 tun_rx_batched drivers/net/tun.c:1555 [inline] tun_get_user+0x6d88/0x7580 drivers/net/tun.c:1962 tun_chr_write_iter+0x1d4/0x330 drivers/net/tun.c:1990 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776 do_iter_write+0x30d/0xd40 fs/read_write.c:932 vfs_writev fs/read_write.c:977 [inline] do_writev+0x3c9/0x830 fs/read_write.c:1012 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085 SyS_writev+0x56/0x80 fs/read_write.c:1082 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King authored
-
- 08 Apr, 2018 4 commits
-
-
Esben Haabendal authored
The datasheet specifies a 3uS pause after performing a software reset. The default implementation of genphy_soft_reset() does not provide this, so implement soft_reset with the needed pause. Signed-off-by: Esben Haabendal <eha@deif.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller authored
Daniel Borkmann says: ==================== pull-request: bpf 2018-04-09 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Two sockmap fixes: i) fix a potential warning when a socket with pending cork data is closed by freeing the memory right when the socket is closed instead of seeing still outstanding memory at garbage collector time, ii) fix a NULL pointer deref in case of duplicates release calls, so make sure to only reset the sk_prot pointer when it's in a valid state to do so, both from John. 2) Fix a compilation warning in bpf_prog_attach_check_attach_type() by moving the function under CONFIG_CGROUP_BPF ifdef since only used there, from Anders. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetoothDavid S. Miller authored
Johan Hedberg says: ==================== pull request: bluetooth 2018-04-08 Here's one important Bluetooth fix for the 4.17-rc series that's needed to pass several Bluetooth qualification test cases. Let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
This resolves race during initialization where the resources with ops are registered before driver and the structures used by occ_get op is initialized. So keep occ_get callbacks registered only when all structs are initialized. The example flows, as it is in mlxsw: 1) driver load/asic probe: mlxsw_core -> mlxsw_sp_resources_register -> mlxsw_sp_kvdl_resources_register -> devlink_resource_register IDX mlxsw_spectrum -> mlxsw_sp_kvdl_init -> mlxsw_sp_kvdl_parts_init -> mlxsw_sp_kvdl_part_init -> devlink_resource_size_get IDX (to get the current setup size from devlink) -> devlink_resource_occ_get_register IDX (register current occupancy getter) 2) reload triggered by devlink command: -> mlxsw_devlink_core_bus_device_reload -> mlxsw_sp_fini -> mlxsw_sp_kvdl_fini -> devlink_resource_occ_get_unregister IDX (struct mlxsw_sp *mlxsw_sp is freed at this point, call to occ get which is using mlxsw_sp would cause use-after free) -> mlxsw_sp_init -> mlxsw_sp_kvdl_init -> mlxsw_sp_kvdl_parts_init -> mlxsw_sp_kvdl_part_init -> devlink_resource_size_get IDX (to get the current setup size from devlink) -> devlink_resource_occ_get_register IDX (register current occupancy getter) Fixes: d9f9b9a4 ("devlink: Add support for resource abstraction") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-