1. 07 Jun, 2019 12 commits
    • lockref: Limit number of cmpxchg loop retries · 893a7d32
      Jan Glauber authored
      The lockref cmpxchg loop is unbounded as long as the spinlock is not
      taken. Depending on the hardware implementation of compare-and-swap,
      a high number of loop retries might happen.
      
      Add an upper bound to the loop to force the fallback to spinlocks
      after some time. A retry value of 100 should not impact any hardware
      that does not have this issue.
      
      With the retry limit, the performance of an open-close testcase
      improved by 60-70% on ThunderX2.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jan Glauber <jglauber@marvell.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
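
The bounded retry described in this commit can be illustrated with a small user-space sketch. It is not the kernel's lockref code: the struct, the bit layout (bit 0 as the lock, the count in the upper bits) and every `demo_` name are assumptions made for illustration. Only the retry value of 100 comes from the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the kernel's struct lockref. */
#define DEMO_LOCKED      ((uint64_t)1)  /* bit 0 stands in for the spinlock */
#define DEMO_COUNT_ONE   ((uint64_t)2)  /* the count lives in the upper bits */
#define DEMO_RETRY_LIMIT 100            /* retry value quoted in the commit message */

struct demo_ref {
	_Atomic uint64_t word;
};

/* Lockless fast path: give up after DEMO_RETRY_LIMIT failed CAS attempts. */
static bool demo_ref_get_fast(struct demo_ref *r)
{
	uint64_t old = atomic_load_explicit(&r->word, memory_order_relaxed);
	int retry = DEMO_RETRY_LIMIT;

	while (!(old & DEMO_LOCKED)) {
		uint64_t next = old + DEMO_COUNT_ONE;

		/* On failure the CAS refreshes 'old' with the current value. */
		if (atomic_compare_exchange_weak_explicit(&r->word, &old, next,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
		if (!--retry)
			break;	/* bounded: stop spinning on the CAS */
	}
	return false;
}

/* Take a reference, falling back to the lock bit when the fast path fails. */
static void demo_ref_get(struct demo_ref *r)
{
	uint64_t old;

	if (demo_ref_get_fast(r))
		return;

	/* Slow path: spin until we own the lock bit. */
	for (;;) {
		old = atomic_load_explicit(&r->word, memory_order_relaxed);
		if (!(old & DEMO_LOCKED) &&
		    atomic_compare_exchange_weak_explicit(&r->word, &old,
							  old | DEMO_LOCKED,
							  memory_order_acquire,
							  memory_order_relaxed))
			break;
	}
	/* Lock held: bump the count and release the lock in one store. */
	atomic_store_explicit(&r->word, old + DEMO_COUNT_ONE,
			      memory_order_release);
}

static uint64_t demo_ref_count(struct demo_ref *r)
{
	return atomic_load_explicit(&r->word, memory_order_relaxed) >> 1;
}
```

While the lock bit is set the fast path returns immediately, and under heavy contention it stops after a fixed number of failed attempts instead of looping for as long as the lock stays free.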
    • uaccess: add noop untagged_addr definition · d9344522
      Andrey Konovalov authored
      Architectures that support memory tagging have a need to perform untagging
      (stripping the tag) in various parts of the kernel. This patch adds an
      untagged_addr() macro, which is defined as noop for architectures that do
      not support memory tagging. The upcoming patch series will define it at
      least for sparc64 and arm64.
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
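
A no-op fallback of this kind is commonly written as a guarded macro. The sketch below assumes that shape; the exact header and wording of the real definition are not shown in the commit message, and `demo_addr_in_range()` is a hypothetical caller added for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * An architecture with memory tagging would define untagged_addr() before
 * this point to strip the tag bits. Everything else gets the no-op below.
 * (Assumed shape of the fallback, not copied from the kernel tree.)
 */
#ifndef untagged_addr
#define untagged_addr(addr) (addr)
#endif

/* Hypothetical caller: untag before comparing against an address range. */
static int demo_addr_in_range(uintptr_t addr, uintptr_t start, uintptr_t end)
{
	addr = untagged_addr(addr);
	return addr >= start && addr < end;
}
```

Generic code can then call `untagged_addr()` unconditionally, and it costs nothing on architectures without tagging.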
    • Merge tag 'xtensa-20190607' of git://github.com/jcmvbkbc/linux-xtensa · d18c7e9d
      Linus Torvalds authored
      Pull xtensa fix from Max Filippov:
       "Fix a section mismatch between memblock_reserve and mem_reserve.
      
        This fixes tinyconfig xtensa builds"
      
      * tag 'xtensa-20190607' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: Fix section mismatch between memblock_reserve and mem_reserve
    • Merge tag 'kbuild-fixes-v5.2-2' of... · 33de0d1c
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull more Kbuild fixes from Masahiro Yamada:
      
       - fix kselftest-merge to find config fragments in deeper directories
      
       - fix kconfig unit test, which was broken by SPDX tag addition
      
       - add + prefix to buildtar to suppress jobserver unavailable warning
      
       - fix checkstack.pl to recognize arch=arm64
      
       - suppress noisy warning from cc-cross-prefix
      
      * tag 'kbuild-fixes-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: use more portable 'command -v' for cc-cross-prefix
        scripts/checkstack.pl: Fix arm64 wrong or unknown architecture
        kbuild: tar-pkg: enable communication with jobserver
        kconfig: tests: fix recursive inclusion unit test
        kbuild: teach kselftest-merge to find nested config files
    • Merge tag 'mmc-v5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 91f152e7
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "Here's a couple of MMC and MEMSTICK fixes:
      
        MMC host:
         - sdhci: Fix SDIO IRQ thread deadlock
         - sdhci-tegra: Fix a warning message
         - sdhci_am654: Fix SLOTTYPE write
         - meson-gx: Fix IRQ ack
         - tmio: Fix SCC error handling to avoid false positive CRC error
      
        MEMSTICK core:
         - mspro_block: Fix returning a correct error code"
      
      * tag 'mmc-v5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: sdhci_am654: Fix SLOTTYPE write
        mmc: sdhci: Fix SDIO IRQ thread deadlock
        mmc: meson-gx: fix irq ack
        mmc: tmio: fix SCC error handling to avoid false positive CRC error
        mmc: tegra: Fix a warning message
        memstick: mspro_block: Fix an error code in mspro_block_issue_req()
    • Merge tag 'pm-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · a373ec23
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix a crash during resume from hibernation introduced during the
        4.19 cycle, cause the new Performance and Energy Bias Hint (EPB) code
        to be built only if CONFIG_PM is set and add a few missing kerneldoc
        comments.
      
        Specifics:
      
         - Fix a crash that occurs when a kernel with 'nosmt' in the command
           line is used to resume the system from hibernation (as the
           "restore" kernel), because memory mapping differences between the
           restore and image kernels cause SMT siblings to be woken up from
           idle states and subsequently they try to fetch instructions from
           incorrect memory locations (Jiri Kosina).
      
         - Cause the new Performance and Energy Bias Hint (EPB) code to be
           built only if CONFIG_PM is set, because that code is not really
           necessary otherwise (Rafael Wysocki).
      
         - Add kerneldoc comments to document some helper functions related
           to system-wide suspend to avoid possible confusion regarding their
           purpose (Rafael Wysocki)"
      
      * tag 'pm-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        x86/power: Fix 'nosmt' vs hibernation triple fault during resume
        PM: sleep: Add kerneldoc comments to some functions
        x86: intel_epb: Do not build when CONFIG_PM is unset
    • x86/insn-eval: Fix use-after-free access to LDT entry · de9f8696
      Jann Horn authored
      get_desc() computes a pointer into the LDT while holding a lock that
      protects the LDT from being freed, but then drops the lock and returns the
      (now potentially dangling) pointer to its caller.
      
      Fix it by giving the caller a copy of the LDT entry instead.
      
      Fixes: 670f928b ("x86/insn-eval: Add utility function to get segment descriptor")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
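
The pattern behind this fix, copying the entry out while the lock is held instead of returning a pointer into the protected table, can be sketched in user space as follows. The table type, the spin lock built on `atomic_flag`, and all `demo_` names are assumptions for illustration. The kernel's `get_desc()` and its LDT locking are not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a descriptor and the table that owns it. */
struct demo_desc {
	unsigned int base;
	unsigned int limit;
};

struct demo_table {
	atomic_flag lock;          /* protects entries and nr_entries */
	struct demo_desc *entries; /* may be replaced or freed under the lock */
	size_t nr_entries;
};

static void demo_lock(struct demo_table *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;	/* spin until the flag is clear */
}

static void demo_unlock(struct demo_table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/*
 * Copy the entry into caller-owned storage while the lock is held.
 * Returning &t->entries[idx] instead would hand out a pointer that can
 * dangle as soon as the lock is dropped.
 */
static bool demo_get_desc(struct demo_table *t, size_t idx,
			  struct demo_desc *out)
{
	bool ok = false;

	demo_lock(t);
	if (t->entries && idx < t->nr_entries) {
		*out = t->entries[idx];
		ok = true;
	}
	demo_unlock(t);
	return ok;
}
```

The caller pays for one small struct copy and in return never touches memory whose lifetime is controlled by the lock.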
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 1e1d9263
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Free AF_PACKET po->rollover properly, from Willem de Bruijn.
      
       2) Read SFP eeprom in max 16 byte increments to avoid problems with
          some SFP modules, from Russell King.
      
       3) Fix UDP socket lookup wrt. VRF, from Tim Beale.
      
       4) Handle route invalidation properly in s390 qeth driver, from Julian
          Wiedmann.
      
       5) Memory leak on unload in RDS, from Zhu Yanjun.
      
       6) sctp_process_init leak, from Neil Horman.
      
       7) Fix fib_rules rule insertion semantic change that broke Android,
          from Hangbin Liu.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
        pktgen: do not sleep with the thread lock held.
        net: mvpp2: Use strscpy to handle stat strings
        net: rds: fix memory leak in rds_ib_flush_mr_pool
        ipv6: fix EFAULT on sendto with icmpv6 and hdrincl
        ipv6: use READ_ONCE() for inet->hdrincl as in ipv4
        Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied"
        net: aquantia: fix wol configuration not applied sometimes
        ethtool: fix potential userspace buffer overflow
        Fix memory leak in sctp_process_init
        net: rds: fix memory leak when unload rds_rdma
        ipv6: fix the check before getting the cookie in rt6_get_cookie
        ipv4: not do cache for local delivery if bc_forwarding is enabled
        s390/qeth: handle error when updating TX queue count
        s390/qeth: fix VLAN attribute in bridge_hostnotify udev event
        s390/qeth: check dst entry before use
        s390/qeth: handle limited IPv4 broadcast in L3 TX path
        net: fix indirect calls helpers for ptype list hooks.
        net: ipvlan: Fix ipvlan device tso disabled while NETIF_F_IP_CSUM is set
        udp: only choose unbound UDP socket for multicast when not in a VRF
        net/tls: replace the sleeping lock around RX resync with a bit lock
        ...
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 6e38335d
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "Things are looking pretty quiet here in RDMA, not too many bug fixes
        rolling in right now. The usual driver bug fixes and fixes for a
        couple of regressions introduced in 5.2:
      
         - Fix a race on bootup with RDMA device renaming and srp. SRP also
           needs to rename its internal sys files
      
         - Fix a memory leak in hns
      
         - Don't leak resources in efa on certain error unwinds
      
         - Don't panic in certain error unwinds in ib_register_device
      
         - Various small user visible bug fix patches for the hfi and efa
           drivers
      
         - Fix the 32 bit compilation break"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/efa: Remove MAYEXEC flag check from mmap flow
        mlx5: avoid 64-bit division
        IB/hfi1: Validate page aligned for a given virtual address
        IB/{qib, hfi1, rdmavt}: Correct ibv_devinfo max_mr value
        IB/hfi1: Insure freeze_work work_struct is canceled on shutdown
        IB/rdmavt: Fix alloc_qpn() WARN_ON()
        RDMA/core: Fix panic when port_data isn't initialized
        RDMA/uverbs: Pass udata on uverbs error unwind
        RDMA/core: Clear out the udata before error unwind
        RDMA/hns: Fix PD memory leak for internal allocation
        RDMA/srp: Rename SRP sysfs name after IB device rename trigger
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · a02a532c
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Another round of mostly-benign fixes, the exception being a boot crash
        on SVE2-capable CPUs (although I don't know where you'd find such a
        thing, so maybe it's benign too).
      
        We're in the process of resolving some big-endian ptrace breakage, so
        I'll probably have some more for you next week.
      
        Summary:
      
         - Fix boot crash on platforms with SVE2 due to missing register
           encoding
      
         - Fix architected timer accessors when CONFIG_OPTIMIZE_INLINING=y
      
         - Move cpu_logical_map into smp.h for use by upcoming irqchip drivers
      
         - Trivial typo fix in comment
      
         - Disable some useless, noisy warnings from GCC 9"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: Silence gcc warnings about arch ABI drift
        ARM64: trivial: s/TIF_SECOMP/TIF_SECCOMP/ comment typo fix
        arm64: arch_timer: mark functions as __always_inline
        arm64: smp: Moved cpu_logical_map[] to smp.h
        arm64: cpufeature: Fix missing ZFR0 in __read_sysreg_by_encoding()
    • kbuild: use more portable 'command -v' for cc-cross-prefix · 913ab978
      Masahiro Yamada authored
      To print the pathname that will be used by shell in the current
      environment, 'command -v' is a standardized way. [1]
      
      'which' is also often used in scripts, but it is less portable.
      
      When I worked on commit bd55f96f ("kbuild: refactor cc-cross-prefix
      implementation"), I was eager to use 'command -v' but it did not work.
      (The reason is explained below.)
      
      I kept 'which' as before but got rid of '> /dev/null 2>&1' as I
      thought it was no longer needed. Sorry, I was wrong.
      
      It works well on my Ubuntu machine, but Alexey Brodkin reports noisy
      warnings on CentOS7 when 'which' fails to find the given command in
      the PATH environment.
      
        $ which foo
        which: no foo in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
      
      Given that behavior of 'which' depends on system (and it may not be
      installed by default), I want to try 'command -v' once again.
      
      The specification [1] clearly describes the behavior of 'command -v'
      when the given command is not found:
      
        Otherwise, no output shall be written and the exit status shall reflect
        that the name was not found.
      
      However, we need a little magic to use 'command -v' from Make.
      
      $(shell ...) passes the argument to a subshell for execution, and
      returns the standard output of the command.
      
      Here is a trick. GNU Make may optimize this by executing the command
      directly instead of forking a subshell, if no shell special characters
      are found in the command and omitting the subshell will not change the
      behavior.
      
      In this case, no shell special character is used, so Make will try
      to run it directly. However, 'command' is a shell-builtin command,
      so Make would fail to find it in the PATH environment:
      
        $ make ARCH=m68k defconfig
        make: command: Command not found
        make: command: Command not found
        make: command: Command not found
      
      In fact, Make has a table of shell-builtin commands because it must
      ask the shell to execute them.
      
      Until recently, 'command' was missing in the table.
      
      This issue was fixed by the following commit:
      
      | commit 1af314465e5dfe3e8baa839a32a72e83c04f26ef
      | Author: Paul Smith <psmith@gnu.org>
      | Date:   Sun Nov 12 18:10:28 2017 -0500
      |
      |     * job.c: Add "command" as a known shell built-in.
      |
      |     This is not a POSIX shell built-in but it's common in UNIX shells.
      |     Reported by Nick Bowler <nbowler@draconx.ca>.
      
      Because the latest release is GNU Make 4.2.1 in 2016, this commit is
      not included in any released versions. (But some distributions may
      have back-ported it.)
      
      We need to trick Make to spawn a subshell. There are various ways to
      do so:
      
       1) Use a shell special character '~' as dummy
      
          $(shell : ~; command -v $(c)gcc)
      
       2) Use a variable reference that always expands to the empty string
          (suggested by David Laight)
      
          $(shell command$${x:+} -v $(c)gcc)
      
       3) Use redirect
      
          $(shell command -v $(c)gcc 2>/dev/null)
      
      I chose 3) to not confuse people. The stderr would not be polluted
      anyway, but it will provide extra safety, and is easy to understand.
      
      Tested on Make 3.81, 3.82, 4.0, 4.1, 4.2, 4.2.1
      
      [1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/command.html
      
      Fixes: bd55f96f ("kbuild: refactor cc-cross-prefix implementation")
      Cc: linux-stable <stable@vger.kernel.org> # 5.1
      Reported-by: Alexey Brodkin <abrodkin@synopsys.com>
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Tested-by: Alexey Brodkin <abrodkin@synopsys.com>
    • Merge branch 'pm-x86' · a964d23c
      Rafael J. Wysocki authored
      * pm-x86:
        x86/power: Fix 'nosmt' vs hibernation triple fault during resume
        x86: intel_epb: Do not build when CONFIG_PM is unset
  2. 06 Jun, 2019 27 commits
    • Merge branch 'parisc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 16d72dd4
      Linus Torvalds authored
      Pull parisc fixes from Helge Deller:
      
       - Fix crashes when accessing PCI devices on some machines like C240 and
         J5000. The crashes were triggered because we replaced cache flushes
         by nops in the alternative coding where we shouldn't for some
         machines.
      
       - Dave fixed a race in the usage of the sr1 space register when used to
         load the coherence index.
      
       - Use the hardware lpa instruction to load the physical address of
         kernel virtual addresses in the iommu driver code.
      
       - The kernel may fail to link when CONFIG_MLONGCALLS isn't set. Solve
         that by rearranging functions in the final vmlinux executable.
      
       - Some defconfig cleanups and removal of compiler warnings.
      
      * 'parisc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Fix crash due alternative coding for NP iopdir_fdc bit
        parisc: Use lpa instruction to load physical addresses in driver code
        parisc: configs: Remove useless UEVENT_HELPER_PATH
        parisc: Use implicit space register selection for loading the coherence index of I/O pdirs
        parisc: Fix compiler warnings in float emulation code
        parisc/slab: cleanup after /proc/slab_allocators removal
        parisc: Allow building 64-bit kernel without -mlong-calls compiler option
        parisc: Kconfig: remove ARCH_DISCARD_MEMBLOCK
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · ae876604
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
       "This fixes a regression that breaks the jitterentropy RNG and a
        potential memory leak in hmac"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: hmac - fix memory leak in hmac_init_tfm()
        crypto: jitterentropy - change back to module_init()
    • Merge tag 'xfs-5.2-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 01047631
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Here are a couple more bug fixes for 5.2. Changes since last update:
      
         - Fix some forgotten strings in a log debugging function
      
         - Fix incorrect unit conversion in online fsck code"
      
      * tag 'xfs-5.2-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: inode btree scrubber should calculate im_boffset correctly
        xfs: fix broken log reservation debugging
    • Merge tag 'gfs2-v5.2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · dc8ca9cc
      Linus Torvalds authored
      Pull gfs2 fix from Andreas Gruenbacher:
       "A revert for a patch that turned out to be broken"
      
      * tag 'gfs2-v5.2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
        Revert "gfs2: Replace gl_revokes with a GLF flag"
    • Merge tag 'ovl-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 5d6b501f
      Linus Torvalds authored
      Pull overlayfs fixes from Miklos Szeredi:
       "Here's one fix for a class of bugs triggered by syzkaller, and one
        that makes xfstests fail less"
      
      * tag 'ovl-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: doc: add non-standard corner cases
        ovl: detect overlapping layers
        ovl: support the FS_IOC_FS[SG]ETXATTR ioctls
    • Merge tag 'fuse-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 21175857
      Linus Torvalds authored
      Pull fuse fixes from Miklos Szeredi:
       "This fixes a leaked inode lock in an error cleanup path and a data
        consistency issue with copy_file_range().
      
        It also adds a new flag for the WRITE request that allows userspace
        filesystems to clear suid/sgid bits on the file if necessary"
      
      * tag 'fuse-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: extract helper for range writeback
        fuse: fix copy_file_range() in the writeback case
        fuse: add FUSE_WRITE_KILL_PRIV
        fuse: fallocate: fix return with locked inode
    • Merge tag 'nfs-for-5.2-2' of git://git.linux-nfs.org/projects/anna/linux-nfs · 459aa077
      Linus Torvalds authored
      Pull NFS client fixes from Anna Schumaker:
       "These are mostly stable bugfixes found during testing, many during the
        recent NFS bake-a-thon.
      
        Stable bugfixes:
         - SUNRPC: Fix regression in umount of a secure mount
         - SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential
         - NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
         - NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
      
        Other bugfixes:
         - xprtrdma: Use struct_size() in kzalloc()"
      
      * tag 'nfs-for-5.2-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
        NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
        SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential
        SUNRPC fix regression in umount of a secure mount
        xprtrdma: Use struct_size() in kzalloc()
    • pktgen: do not sleep with the thread lock held. · 720f1de4
      Paolo Abeni authored
      Currently, the process issuing a "start" command on the pktgen procfs
      interface acquires the pktgen thread lock and never releases it until
      all pktgen threads have completed. This can block indefinitely any
      other pktgen command and any (even unrelated) netdevice removal, as
      the pktgen netdev notifier acquires the same lock.
      
      The issue is demonstrated by the following script, reported by Matteo:
      
      ip -b - <<'EOF'
      	link add type dummy
      	link add type veth
      	link set dummy0 up
      EOF
      modprobe pktgen
      echo reset >/proc/net/pktgen/pgctrl
      {
      	echo rem_device_all
      	echo add_device dummy0
      } >/proc/net/pktgen/kpktgend_0
      echo count 0 >/proc/net/pktgen/dummy0
      echo start >/proc/net/pktgen/pgctrl &
      sleep 1
      rmmod veth
      
      Fix the above by releasing the thread lock around the sleep call.
      
      Additionally, we must prevent racing with a forceful rmmod, as the
      thread lock no longer protects against it. Instead, acquire a self-reference
      before waiting for any thread. As a side effect, running
      
      rmmod pktgen
      
      while some thread is running now fails with a "module in use" error;
      before this patch, such a command hung indefinitely.
      
      Note: the issue predates the commit reported in the fixes tag, but
      this fix can't be applied before the mentioned commit.
      
      v1 -> v2:
       - no need to check for thread existence after flipping the lock,
         pktgen threads are freed only at net exit time
      
      Fixes: 6146e6a4 ("[PKTGEN]: Removes thread_{un,}lock() macros.")
      Reported-and-tested-by: Matteo Croce <mcroce@redhat.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'for-rc-adfs' of git://git.armlinux.org.uk/~rmk/linux-arm · 44e843eb
      Linus Torvalds authored
      Pull ADFS cleanups/fixes from Russell King:
       "As a result of some of Al Viro's great work, here are a few cleanups
        with fixes for adfs:
      
         - factor out filename comparison, so we can be sure that
           adfs_compare() (used for namei compare) and adfs_match() (used for
           lookup) have the same behaviour.
      
         - factor out filename lowering (which is not the same as tolower()
           which will lower top-bit-set characters) to ensure that we have the
           same behaviour when comparing filenames as when we hash them.
      
         - factor out the object fixups, so we are applying all fixups to
           directory objects in the same way, independent of the disk format.
      
         - factor out the object name fixup (into the previously factored out
           function) to ensure that filenames are appropriately translated -
           for example, adfs allows '/' in filenames, which being the Unix
           path separator, need to be translated to a different character,
           which is normally '.' (DOS 8.3 filenames represent the . as a / on
           adfs, so this is the expected reverse translation.)
      
         - remove filename truncation; Al asked about this and apparently the
           decision is to remove it. In any case, adfs's truncation was buggy,
           so this rids us of that bug by removing the truncation feature.
      
         - we now have only one location which adds the "filetype" suffix to
           the filename, so there's no point that code being out of line.
      
         - since we translate '/' into '.', an adfs filename of "/" or "//"
           would end up being translated to "." and ".." which have special
           meanings. In this case, change the first character to "^" to avoid
           these special directory names being abused"
      
      * tag 'for-rc-adfs' of git://git.armlinux.org.uk/~rmk/linux-arm:
        fs/adfs: fix filename fixup handling for "/" and "//" names
        fs/adfs: move append_filetype_suffix() into adfs_object_fixup()
        fs/adfs: remove truncated filename hashing
        fs/adfs: factor out filename fixup
        fs/adfs: factor out object fixups
        fs/adfs: factor out filename case lowering
        fs/adfs: factor out filename comparison
    • net: mvpp2: Use strscpy to handle stat strings · d37acd5a
      Maxime Chevallier authored
      Use a safe strscpy call to copy the ethtool stat strings into the
      relevant buffers, instead of a memcpy that will be accessing
      out-of-bound data.
      
      Fixes: 118d6298 ("net: mvpp2: add ethtool GOP statistics")
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
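
The difference between a fixed-length memcpy and a bounded string copy can be sketched in user space. `demo_strscpy()` below is a hypothetical helper that only mimics the general behaviour of a bounded, always-terminating copy. It is not the kernel's strscpy(), and it returns -1 on truncation as a simplification. The buffer size of 32 matches the kernel's ETH_GSTRING_LEN.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_GSTRING_LEN 32	/* same value as the kernel's ETH_GSTRING_LEN */

/*
 * Hypothetical bounded copy: never reads past the source's terminator,
 * never writes more than 'size' bytes, always NUL-terminates.
 * Returns the copied length, or -1 if the source was truncated.
 */
static long demo_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -1;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] == '\0' ? (long)i : -1;
}

/*
 * By contrast, memcpy(dst, "rx_bytes", DEMO_GSTRING_LEN) would read 32
 * bytes from a 9-byte string literal, which is the out-of-bounds access
 * the commit message describes.
 */
static void demo_fill_stat_name(char dst[DEMO_GSTRING_LEN], const char *name)
{
	demo_strscpy(dst, name, DEMO_GSTRING_LEN);
}
```

The copy stops at the source terminator, so short stat names no longer cause reads beyond the end of the string.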
    • net: rds: fix memory leak in rds_ib_flush_mr_pool · 85cb9287
      Zhu Yanjun authored
      When the following tests last for several hours, the problem will occur.
      
      Server:
          rds-stress -r 1.1.1.16 -D 1M
      Client:
          rds-stress -r 1.1.1.14 -s 1.1.1.16 -D 1M -T 30
      
      The following will occur.
      
      "
      Starting up....
      tsks   tx/s   rx/s  tx+rx K/s    mbi K/s    mbo K/s tx us/c   rtt us cpu %
        1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
        1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
        1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
        1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
      "
      From vmcore, we can find that clean_list is NULL.
      
      From the source code, rds_mr_flushd calls rds_ib_mr_pool_flush_worker.
      Then rds_ib_mr_pool_flush_worker calls
      "
       rds_ib_flush_mr_pool(pool, 0, NULL);
      "
      Then in function
      "
      int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *pool,
                               int free_all, struct rds_ib_mr **ibmr_ret)
      "
      ibmr_ret is NULL.
      
      In the source code,
      "
      ...
      list_to_llist_nodes(pool, &unmap_list, &clean_nodes, &clean_tail);
      if (ibmr_ret)
              *ibmr_ret = llist_entry(clean_nodes, struct rds_ib_mr, llnode);
      
      /* more than one entry in llist nodes */
      if (clean_nodes->next)
              llist_add_batch(clean_nodes->next, clean_tail, &pool->clean_list);
      ...
      "
      When ibmr_ret is NULL, llist_entry is not executed, and clean_nodes->next
      instead of clean_nodes is added to clean_list.
      So clean_nodes is discarded and cannot be used again.
      The workqueue is executed periodically, so more and more clean_nodes are
      discarded. Finally the clean_list is NULL,
      and then this problem occurs.
      
      Fixes: 1bc144b6 ("net, rds, Replace xlist in net/rds/xlist.h with llist")
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
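
The leak described above comes from always skipping the first node of the cleaned chain, even when no caller takes it. One way to avoid that is to skip the head only when it is actually handed back. The sketch below uses a plain singly linked list and hypothetical `demo_` names. It is an assumption about the shape of such a fix, not the patch itself, and it ignores the lock-free aspects of the kernel's llist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal node; the kernel uses struct llist_node. */
struct demo_node {
	struct demo_node *next;
};

/* Push the chain first..last onto the front of *list. */
static void demo_add_batch(struct demo_node *first, struct demo_node *last,
			   struct demo_node **list)
{
	last->next = *list;
	*list = first;
}

/*
 * Return cleaned nodes to the pool. The head is withheld only when the
 * caller asked for one node back (ret != NULL); otherwise the whole
 * chain goes onto clean_list and nothing is dropped.
 */
static void demo_return_clean(struct demo_node *clean_nodes,
			      struct demo_node *clean_tail,
			      struct demo_node **clean_list,
			      struct demo_node **ret)
{
	if (ret) {
		*ret = clean_nodes;
		clean_nodes = clean_nodes->next;
	}
	if (clean_nodes)
		demo_add_batch(clean_nodes, clean_tail, clean_list);
}

/* Count the nodes on a list; used to check that none were lost. */
static size_t demo_len(const struct demo_node *n)
{
	size_t len = 0;

	for (; n; n = n->next)
		len++;
	return len;
}
```

With this shape, a periodic flush that passes NULL for the return pointer keeps every cleaned node on the list instead of losing one per run.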
    • Merge branch 'ipv6-fix-EFAULT-on-sendto-with-icmpv6-and-hdrincl' · 8d037f92
      David S. Miller authored
      Olivier Matz says:
      
      ====================
      ipv6: fix EFAULT on sendto with icmpv6 and hdrincl
      
      The following code returns EFAULT (Bad address):
      
        s = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);
        setsockopt(s, SOL_IPV6, IPV6_HDRINCL, 1);
        sendto(ipv6_icmp6_packet, addr);   /* returns -1, errno = EFAULT */
      
      The problem is fixed in the second patch. The first one aligns the
      code to ipv4, to avoid a race condition in the second patch.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: fix EFAULT on sendto with icmpv6 and hdrincl · b9aa52c4
      Olivier Matz authored
      The following code returns EFAULT (Bad address):
      
        s = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);
        setsockopt(s, SOL_IPV6, IPV6_HDRINCL, 1);
        sendto(ipv6_icmp6_packet, addr);   /* returns -1, errno = EFAULT */
      
      The IPv4 equivalent code works. A workaround is to use IPPROTO_RAW
      instead of IPPROTO_ICMPV6.
      
      The failure happens because 2 bytes are eaten from the msghdr by
      rawv6_probe_proto_opt() starting from commit 19e3c66b ("ipv6
      equivalent of "ipv4: Avoid reading user iov twice after
      raw_probe_proto_opt""), but at that time it was not a problem because
      IPV6_HDRINCL was not yet introduced.
      
      Only eat these 2 bytes if hdrincl == 0.
      
      Fixes: 715f504b ("ipv6: add IPV6_HDRINCL option for raw sockets")
      Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: use READ_ONCE() for inet->hdrincl as in ipv4 · 59e3e4b5
      Olivier Matz authored
      As it was done in commit 8f659a03 ("net: ipv4: fix for a race
      condition in raw_sendmsg") and commit 20b50d79 ("net: ipv4: emulate
      READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()") for ipv4, copy the
      value of inet->hdrincl in a local variable, to avoid introducing a race
      condition in the next commit.
      Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      59e3e4b5
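
The snapshot idea in commit 59e3e4b5 above can be illustrated with a small userspace sketch. It is not the kernel code: `READ_ONCE_INT`, `struct fake_sock`, `clear_hdrincl()` and `send_decisions()` are hypothetical names, and the flag is a plain `int` here, whereas the real `hdrincl` is a bit-field (which is why the ipv4 commit cited above has to emulate READ_ONCE()). The point is only that every later decision uses one local copy of the flag.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's READ_ONCE(): force a single load. */
#define READ_ONCE_INT(x) (*(const volatile int *)&(x))

/* Hypothetical socket state; the real hdrincl is a bit-field. */
struct fake_sock {
    int hdrincl;
};

/* Simulates a concurrent setsockopt() clearing the flag mid-send. */
static void clear_hdrincl(struct fake_sock *sk)
{
    sk->hdrincl = 0;
}

/*
 * Makes two decisions that must agree. Both use the local snapshot, so a
 * change to sk->hdrincl between them cannot split the decisions.
 * Returns (first << 1) | second.
 */
static int send_decisions(struct fake_sock *sk,
                          void (*racer)(struct fake_sock *))
{
    int hdrincl, first, second;

    assert(sk != NULL);
    hdrincl = READ_ONCE_INT(sk->hdrincl);   /* read the shared flag once */

    first = hdrincl ? 1 : 0;    /* e.g. whether to skip protocol probing */
    if (racer)
        racer(sk);              /* the flag changes underneath us */
    second = hdrincl ? 1 : 0;   /* e.g. whether to take the hdrincl send path */

    return (first << 1) | second;
}
```

Had the second decision re-read `sk->hdrincl` instead of the local copy, the two decisions could disagree whenever the flag changed in between.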
    • Bob Peterson's avatar
      Revert "gfs2: Replace gl_revokes with a GLF flag" · 638803d4
      Bob Peterson authored
      Commit 73118ca8 introduced a glock reference counting bug in
      gfs2_trans_remove_revoke.  Given that, replacing gl_revokes with a GLF flag is
      no longer useful, so revert that commit.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      638803d4
    • Dave Martin's avatar
      arm64: Silence gcc warnings about arch ABI drift · ebcc5928
      Dave Martin authored
      Since GCC 9, the compiler warns about evolution of the
      platform-specific ABI, in particular relating to the marshaling of
      certain structures involving bitfields.
      
      The kernel is a standalone binary, and of course nobody would be
      so stupid as to expose structs containing bitfields as function
      arguments in ABI.  (Passing a pointer to such a struct, however
      inadvisable, should be unaffected by this change.  perf and various
      drivers rely on that.)
      
      So these warnings do more harm than good: turn them off.
      
      We may miss warnings about future ABI drift, but that's too bad.
      Future ABI breaks of this class will have to be debugged and fixed
      the traditional way unless the compiler evolves finer-grained
      diagnostics.
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      ebcc5928
    • Helge Deller's avatar
      parisc: Fix crash due alternative coding for NP iopdir_fdc bit · 527a1d1e
      Helge Deller authored
      According to the documentation we found, data cache flushes and sync
      instructions are needed on the PCX-U+ (PA8200, e.g. C200/C240)
      platforms, while PCX-W (PA8500, e.g. C360) platforms apparently don't
      need those flushes when changing the IO PDIR data structures.
      
      We have no documentation for PCX-W+ (PA8600) and PCX-W2 (PA8700) CPUs,
      but Carlo Pisani reported that his C3600 machine (PA8600, PCX-W+) fails
      when the fdc instructions were removed. His firmware didn't set the NIOP
      bit, so one may assume it's a firmware bug since other C3750 machines
      had the bit set.
      
      Even if documentation (as mentioned above) states that PCX-W (PA8500,
      e.g.  J5000) does not need fdc flushes, Sven could show that an Adaptec
      29320A PCI-X SCSI controller reliably failed on a dd command during the
      first five minutes in his J5000 when fdc flushes were missing.
      
      Going forward, we will now NOT replace the fdc and sync assembler
      instructions by NOPS if:
      a) the NP iopdir_fdc bit was set by firmware, or
      b) we find a CPU up to and including a PCX-W+ (PA8600).
      
      This fixes the HPMC crashes on C240 and C36XX machines. For other
      machines we rely on the firmware to set the bit when needed.
      
      Anyone who still sees HPMC issues can try booting with the
      "no-alternatives" kernel option to turn off any alternative
      patching.
      Reported-by: Sven Schnelle <svens@stackframe.org>
      Reported-by: Carlo Pisani <carlojpisani@gmail.com>
      Tested-by: Sven Schnelle <svens@stackframe.org>
      Fixes: 3847dab7 ("parisc: Add alternative coding infrastructure")
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # 5.0+
      527a1d1e
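
The rule in commit 527a1d1e above for when the fdc and sync instructions are kept (conditions a and b) can be written as a small predicate. This is only a sketch under assumptions: `enum cpu_gen`, its ordering, and `keep_fdc_and_sync()` are hypothetical names for illustration and do not reproduce the kernel's actual CPU type enumeration or alternative-patching code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ordering of the CPU generations named above, oldest first. */
enum cpu_gen {
    CPU_PCXU_PLUS,  /* PA8200 */
    CPU_PCXW,       /* PA8500 */
    CPU_PCXW_PLUS,  /* PA8600 */
    CPU_PCXW2       /* PA8700 */
};

/*
 * Returns true when the fdc/sync instructions must NOT be patched to NOPs:
 *  a) firmware set the NP iopdir_fdc bit, or
 *  b) the CPU is a PCX-W+ (PA8600) or older.
 */
static bool keep_fdc_and_sync(enum cpu_gen cpu, bool fw_iopdir_fdc)
{
    assert(cpu <= CPU_PCXW2);
    return fw_iopdir_fdc || cpu <= CPU_PCXW_PLUS;
}
```

With this predicate, a PA8700 keeps the flushes only when firmware sets the bit, which matches the commit's statement that newer machines rely on firmware.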
    • John David Anglin's avatar
      parisc: Use lpa instruction to load physical addresses in driver code · 116d7533
      John David Anglin authored
      Most I/O in the kernel is done using the kernel offset mapping.
      However, there is one API that uses aliased kernel address ranges:
      
      > The final category of APIs is for I/O to deliberately aliased address
      > ranges inside the kernel.  Such aliases are set up by use of the
      > vmap/vmalloc API.  Since kernel I/O goes via physical pages, the I/O
      > subsystem assumes that the user mapping and kernel offset mapping are
      > the only aliases.  This isn't true for vmap aliases, so anything in
      > the kernel trying to do I/O to vmap areas must manually manage
      > coherency.  It must do this by flushing the vmap range before doing
      > I/O and invalidating it after the I/O returns.
      
      For this reason, we should use the hardware lpa instruction to load the
      physical address of kernel virtual addresses in the driver code.
      
      I believe we only use the vmap/vmalloc API with old PA 1.x processors
      which don't have a sba, so we don't hit this problem.
      
      Tested on c3750, c8000 and rp3440.
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Signed-off-by: Helge Deller <deller@gmx.de>
      116d7533
    • Krzysztof Kozlowski's avatar
      parisc: configs: Remove useless UEVENT_HELPER_PATH · ec13c82d
      Krzysztof Kozlowski authored
      Remove the CONFIG_UEVENT_HELPER_PATH because:
      1. It is disabled since commit 1be01d4a ("driver: base: Disable
         CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was
         made default to 'n',
      2. It is not recommended (help message: "This should not be used today
         [...] creates a high system load") and was kept only for ancient
         userland,
      3. Certain userland specifically requests it to be disabled (systemd
         README: "Legacy hotplug slows down the system and confuses udev").
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: Helge Deller <deller@gmx.de>
      ec13c82d
    • John David Anglin's avatar
      parisc: Use implicit space register selection for loading the coherence index of I/O pdirs · 63923d2c
      John David Anglin authored
      We only support I/O to kernel space. Using %sr1 to load the coherence
      index may be racy unless interrupts are disabled. This patch changes the
      code used to load the coherence index to use implicit space register
      selection. This saves one instruction and eliminates the race.
      
      Tested on rp3440, c8000 and c3750.
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org
      Signed-off-by: Helge Deller <deller@gmx.de>
      63923d2c
    • George G. Davis's avatar
      ARM64: trivial: s/TIF_SECOMP/TIF_SECCOMP/ comment typo fix · 2b55d83e
      George G. Davis authored
      Fix a s/TIF_SECOMP/TIF_SECCOMP/ comment typo
      
      Cc: Jiri Kosina <trivial@kernel.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: George G. Davis <george_davis@mentor.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      2b55d83e
    • Hangbin Liu's avatar
      Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied" · 4970b42d
      Hangbin Liu authored
      This reverts commit e9919a24.
      
      Nathan reported that the new behaviour breaks Android, as Android just
      adds new rules and deletes old ones.
      
      If we return 0 without adding duplicate rules, Android will remove the
      newly added rules, causing the system to soft-reboot.
      
      Fixes: e9919a24 ("fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied")
      Reported-by: Nathan Chancellor <natechancellor@gmail.com>
      Reported-by: Yaro Slav <yaro330@gmail.com>
      Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Tested-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4970b42d
    • Nikita Danilov's avatar
      net: aquantia: fix wol configuration not applied sometimes · 930b9a05
      Nikita Danilov authored
      WoL magic packet configuration sometimes does not work due to a
      couple of flaws found.
      
      Mainly, there was a regression introduced during the readx_poll refactoring.
      
      Next, the fw request waiting time was too short. Sometimes that
      caused the sleep proxy config function to return with an error
      and to skip WoL configuration.
      Lastly, WoL data was passed to the FW from a buffer that had not been
      cleared. That could cause the FW to accept garbage as configuration data.
      
      Fixes: 6a7f2277 ("net: aquantia: replace AQ_HW_WAIT_FOR with readx_poll_timeout_atomic")
      Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
      Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      930b9a05
    • Vivien Didelot's avatar
      ethtool: fix potential userspace buffer overflow · 0ee4e769
      Vivien Didelot authored
      ethtool_get_regs() allocates a buffer of size ops->get_regs_len(),
      and passes it to the kernel driver via ops->get_regs() for filling.
      
      There is no restriction about what the kernel drivers can or cannot do
      with the open ethtool_regs structure. They usually set regs->version
      and ignore regs->len or set it to the same size as ops->get_regs_len().
      
      But if userspace allocates a smaller buffer for the registers dump,
      we would cause a userspace buffer overflow in the final copy_to_user()
      call, which uses the regs.len value potentially reset by the driver.
      
      To fix this, make this case obvious and store regs.len before calling
      ops->get_regs(), to only copy as much data as requested by userspace,
      up to the value returned by ops->get_regs_len().
      
      While at it, remove the redundant check for non-null regbuf.
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ee4e769
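
The length handling described in commit 0ee4e769 above can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel function: `struct regs_hdr`, `fake_get_regs()` and `get_regs_clamped()` are hypothetical stand-ins, and `memcpy()` stands in for copy_to_user(). The requested length is saved before the driver callback runs, capped at the driver's dump size, and only that many bytes are copied out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the header userspace passes in. */
struct regs_hdr {
    unsigned int version;
    unsigned int len;       /* size of the userspace register buffer */
};

/* Hypothetical driver callback that resets hdr->len to its full dump size. */
static void fake_get_regs(struct regs_hdr *hdr, unsigned char *buf,
                          unsigned int reglen)
{
    hdr->version = 1;
    hdr->len = reglen;      /* overwrites the length userspace asked for */
    memset(buf, 0xab, reglen);
}

/* Copies the register dump out and returns the number of bytes copied. */
static size_t get_regs_clamped(struct regs_hdr *hdr, unsigned char *user_buf,
                               unsigned char *kbuf, unsigned int reglen)
{
    unsigned int requested;

    assert(hdr != NULL && user_buf != NULL && kbuf != NULL);

    requested = hdr->len;   /* saved before the driver can change it */
    if (requested > reglen)
        requested = reglen; /* never copy more than the driver provides */

    fake_get_regs(hdr, kbuf, reglen);

    memcpy(user_buf, kbuf, requested);  /* stand-in for copy_to_user() */
    return requested;
}
```

Copying `hdr->len` bytes after the callback instead would write the driver's full dump size into a smaller user buffer, which is the overflow the commit describes.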
    • Neil Horman's avatar
      Fix memory leak in sctp_process_init · 0a8dd9f6
      Neil Horman authored
      syzbot found the following leak in sctp_process_init
      BUG: memory leak
      unreferenced object 0xffff88810ef68400 (size 1024):
        comm "syz-executor273", pid 7046, jiffies 4294945598 (age 28.770s)
        hex dump (first 32 bytes):
          1d de 28 8d de 0b 1b e3 b5 c2 f9 68 fd 1a 97 25  ..(........h...%
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000a02cebbd>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
          [<00000000a02cebbd>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<00000000a02cebbd>] slab_alloc mm/slab.c:3326 [inline]
          [<00000000a02cebbd>] __do_kmalloc mm/slab.c:3658 [inline]
          [<00000000a02cebbd>] __kmalloc_track_caller+0x15d/0x2c0 mm/slab.c:3675
          [<000000009e6245e6>] kmemdup+0x27/0x60 mm/util.c:119
          [<00000000dfdc5d2d>] kmemdup include/linux/string.h:432 [inline]
          [<00000000dfdc5d2d>] sctp_process_init+0xa7e/0xc20 net/sctp/sm_make_chunk.c:2437
          [<00000000b58b62f8>] sctp_cmd_process_init net/sctp/sm_sideeffect.c:682 [inline]
          [<00000000b58b62f8>] sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1384 [inline]
          [<00000000b58b62f8>] sctp_side_effects net/sctp/sm_sideeffect.c:1194 [inline]
          [<00000000b58b62f8>] sctp_do_sm+0xbdc/0x1d60 net/sctp/sm_sideeffect.c:1165
          [<0000000044e11f96>] sctp_assoc_bh_rcv+0x13c/0x200 net/sctp/associola.c:1074
          [<00000000ec43804d>] sctp_inq_push+0x7f/0xb0 net/sctp/inqueue.c:95
          [<00000000726aa954>] sctp_backlog_rcv+0x5e/0x2a0 net/sctp/input.c:354
          [<00000000d9e249a8>] sk_backlog_rcv include/net/sock.h:950 [inline]
          [<00000000d9e249a8>] __release_sock+0xab/0x110 net/core/sock.c:2418
          [<00000000acae44fa>] release_sock+0x37/0xd0 net/core/sock.c:2934
          [<00000000963cc9ae>] sctp_sendmsg+0x2c0/0x990 net/sctp/socket.c:2122
          [<00000000a7fc7565>] inet_sendmsg+0x64/0x120 net/ipv4/af_inet.c:802
          [<00000000b732cbd3>] sock_sendmsg_nosec net/socket.c:652 [inline]
          [<00000000b732cbd3>] sock_sendmsg+0x54/0x70 net/socket.c:671
          [<00000000274c57ab>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2292
          [<000000008252aedb>] __sys_sendmsg+0x80/0xf0 net/socket.c:2330
          [<00000000f7bf23d1>] __do_sys_sendmsg net/socket.c:2339 [inline]
          [<00000000f7bf23d1>] __se_sys_sendmsg net/socket.c:2337 [inline]
          [<00000000f7bf23d1>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2337
          [<00000000a8b4131f>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:3
      
      The problem was that the peer.cookie value points to an skb allocated
      area on the first pass through this function, at which point it is
      overwritten with a heap allocated value, but in certain cases, where a
      COOKIE_ECHO chunk is included in the packet, a second pass through
      sctp_process_init is made, where the cookie value is re-allocated,
      leaking the first allocation.
      
      Fix is to always allocate the cookie value, and free it when we are done
      using it.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: syzbot+f7e9153b037eac9b1df8@syzkaller.appspotmail.com
      CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netdev@vger.kernel.org
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a8dd9f6
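
The leak pattern in commit 0a8dd9f6 above, where a second pass re-allocates a value and drops the first allocation, can be shown with a generic userspace sketch. It does not reproduce the SCTP code: `struct peer`, `peer_set_cookie()`, `peer_release_cookie()` and the `live_allocs` counter are hypothetical and exist only to make the ownership rule visible. The owner always holds its own heap copy, and that copy is freed both on replacement and when it is no longer needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_allocs;     /* outstanding allocations, for the demo only */

/* Hypothetical owner of a heap-allocated cookie. */
struct peer {
    unsigned char *cookie;
    size_t cookie_len;
};

/* Stores a private copy of src; frees any copy held from an earlier pass. */
static int peer_set_cookie(struct peer *p, const unsigned char *src, size_t len)
{
    unsigned char *copy;

    assert(p != NULL && src != NULL && len > 0);

    copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, src, len);
    live_allocs++;

    if (p->cookie) {        /* second pass: release the first allocation */
        free(p->cookie);
        live_allocs--;
    }
    p->cookie = copy;
    p->cookie_len = len;
    return 0;
}

/* Frees the cookie once the owner is done with it. */
static void peer_release_cookie(struct peer *p)
{
    assert(p != NULL);
    if (p->cookie) {
        free(p->cookie);
        live_allocs--;
    }
    p->cookie = NULL;
    p->cookie_len = 0;
}
```

Calling `peer_set_cookie()` twice leaves exactly one live allocation. Without the free on replacement, the first copy would be unreachable, which is the kind of leak kmemleak reported.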
    • Zhu Yanjun's avatar
      net: rds: fix memory leak when unload rds_rdma · b50e0587
      Zhu Yanjun authored
      When KASAN is enabled and several rds connections have been
      created, running "rmmod rds_rdma" produces the following
      output.
      
      "
      BUG rds_ib_incoming (Not tainted): Objects remaining
      in rds_ib_incoming on __kmem_cache_shutdown()
      
      Call Trace:
       dump_stack+0x71/0xab
       slab_err+0xad/0xd0
       __kmem_cache_shutdown+0x17d/0x370
       shutdown_cache+0x17/0x130
       kmem_cache_destroy+0x1df/0x210
       rds_ib_recv_exit+0x11/0x20 [rds_rdma]
       rds_ib_exit+0x7a/0x90 [rds_rdma]
       __x64_sys_delete_module+0x224/0x2c0
       ? __ia32_sys_delete_module+0x2c0/0x2c0
       do_syscall_64+0x73/0x190
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      "
      This is an rds connection memory leak. The root cause is as follows.
      When "rmmod rds_rdma" is run, rds_ib_remove_one calls
      rds_ib_dev_shutdown to drop the rds connections.
      rds_ib_dev_shutdown in turn calls rds_conn_drop to drop the rds
      connections, as below.
      "
      rds_conn_path_drop(&conn->c_path[0], false);
      "
      In the above, destroy is set to false.
      void rds_conn_path_drop(struct rds_conn_path *cp, bool destroy)
      {
              atomic_set(&cp->cp_state, RDS_CONN_ERROR);
      
              rcu_read_lock();
              if (!destroy && rds_destroy_pending(cp->cp_conn)) {
                      rcu_read_unlock();
                      return;
              }
              queue_work(rds_wq, &cp->cp_down_w);
              rcu_read_unlock();
      }
      In the above function, with destroy set to false, rds_destroy_pending
      is called, and the rds connections are not moved to ib_nodev_conns.
      So destroy is now set to true to move rds connections to ib_nodev_conns.
      In rds_ib_unregister_client, flush_workqueue is called to make rds_wq
      finish shutting down the rds connections. Finally, the function
      rds_ib_destroy_nodev_conns is called to shut down the rds connections.
      Then rds_ib_recv_exit is called to destroy the slab.
      
      void rds_ib_recv_exit(void)
      {
              kmem_cache_destroy(rds_ib_incoming_slab);
              kmem_cache_destroy(rds_ib_frag_slab);
      }
      The above slab memory leak will not occur again.
      
      From tests,
      256 rds connections
      [root@ca-dev14 ~]# time rmmod rds_rdma
      
      real    0m16.522s
      user    0m0.000s
      sys     0m8.152s
      512 rds connections
      [root@ca-dev14 ~]# time rmmod rds_rdma
      
      real    0m32.054s
      user    0m0.000s
      sys     0m15.568s
      
      To rmmod rds_rdma with 256 rds connections, about 16 seconds are needed.
      And with 512 rds connections, about 32 seconds are needed.
      From ftrace, when one rds connection is destroyed,
      
      "
       19)               |  rds_conn_destroy [rds]() {
       19)   7.782 us    |    rds_conn_path_drop [rds]();
       15)               |  rds_shutdown_worker [rds]() {
       15)               |    rds_conn_shutdown [rds]() {
       15)   1.651 us    |      rds_send_path_reset [rds]();
       15)   7.195 us    |    }
       15) + 11.434 us   |  }
       19)   2.285 us    |    rds_cong_remove_conn [rds]();
       19) * 24062.76 us |  }
      "
      So if many rds connections are destroyed, the function
      rds_ib_destroy_nodev_conns takes most of the time.
      Suggested-by: Håkon Bugge <haakon.bugge@oracle.com>
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b50e0587
    • Xin Long's avatar
      ipv6: fix the check before getting the cookie in rt6_get_cookie · b7999b07
      Xin Long authored
      In Jianlin's testing, netperf was broken with 'Connection reset by peer',
      as the cookie check failed in rt6_check() and ip6_dst_check() always
      returned NULL.
      
      It is caused by commit 93531c67 ("net/ipv6: separate handling of FIB
      entries from dst based routes"), after which the cookie used to set
      dst_cookie is only fetched when 'c1' (see below) holds, whereas
      rt6_check() is called to check dst_cookie when 'c1' does not hold, as
      can be seen in ip6_dst_check().
      
      Since in ip6_dst_check() both rt6_dst_from_check() (c1) and rt6_check()
      (!c1) will check the 'from' cookie, this patch is to remove the c1 check
      in rt6_get_cookie(), so that the dst_cookie can always be set properly.
      
      c1:
        (rt->rt6i_flags & RTF_PCPU || unlikely(!list_empty(&rt->rt6i_uncached)))
      
      Fixes: 93531c67 ("net/ipv6: separate handling of FIB entries from dst based routes")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b7999b07
  3. 05 Jun, 2019 1 commit
    • Xin Long's avatar
      ipv4: not do cache for local delivery if bc_forwarding is enabled · 0a90478b
      Xin Long authored
      With the topo:
      
          h1 ---| rp1            |
                |     route  rp3 |--- h3 (192.168.200.1)
          h2 ---| rp2            |
      
      If bc_forwarding is set on rp1 but not on rp2, then after doing
      "ping 192.168.200.255" on h1, doing "ping 192.168.200.255" on
      h2 still gets its packets forwarded.
      
      This issue was caused by the input route cache. It should only do
      the cache for either bc forwarding or local delivery. Otherwise,
      local delivery can use the route cache for bc forwarding of other
      interfaces.
      
      This patch is to fix it by not doing cache for local delivery if
      all.bc_forwarding is enabled.
      
      Note that we don't fix it by checking route cache local flag after
      rt_cache_valid() in "local_input:" and "ip_mkroute_input", as the
      common route code shouldn't be touched for bc_forwarding.
      
      Fixes: 5cbf777c ("route: add support for directed broadcast forwarding")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0a90478b