1. 05 Mar, 2019 10 commits
  2. 27 Feb, 2019 30 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.104 · 30921fc1
      Greg Kroah-Hartman authored
      30921fc1
    • Russell King's avatar
      net: phylink: avoid resolving link state too early · 418d77ca
      Russell King authored
      commit 87454b6e upstream.
      
      During testing on Armada 388 platforms, it was found with a certain
      module configuration that it was possible to trigger a kernel oops
      during the module load process, caused by the phylink resolver being
      triggered for a currently disabled interface.
      
      This problem was introduced by changing the way the SFP registration
      works, which now can result in the sfp link down notification being
      called during phylink_create().
      
      Fixes: b5bfc21a ("net: sfp: do not probe SFP module before we're attached")
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Sasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      418d77ca
    • Matthias Kaehlcke's avatar
      sched/sysctl: Fix attributes of some extern declarations · 81bafd09
      Matthias Kaehlcke authored
      commit a9903f04 upstream.
      
      The definition of sysctl_sched_migration_cost, sysctl_sched_nr_migrate
      and sysctl_sched_time_avg includes the attribute const_debug. This
      attribute is not part of the extern declaration of these variables in
      include/linux/sched/sysctl.h, while it is in kernel/sched/sched.h,
      and as a result Clang generates warnings like this:
      
        kernel/sched/sched.h:1618:33: warning: section attribute is specified on redeclared variable [-Wsection]
        extern const_debug unsigned int sysctl_sched_time_avg;
                                      ^
        ./include/linux/sched/sysctl.h:42:21: note: previous declaration is here
        extern unsigned int sysctl_sched_time_avg;
      
      The header only declares the variables when CONFIG_SCHED_DEBUG is defined,
      therefore it is not necessary to duplicate the definition of const_debug.
      Instead we can use the attribute __read_mostly, which is the expansion of
      const_debug when CONFIG_SCHED_DEBUG=y is set.
      Signed-off-by: default avatarMatthias Kaehlcke <mka@chromium.org>
      Reviewed-by: default avatarNick Desaulniers <nick.desaulniers@gmail.com>
      Cc: Douglas Anderson <dianders@chromium.org>
      Cc: Guenter Roeck <groeck@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Shile Zhang <shile.zhang@nokia.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20171030180816.170850-1-mka@chromium.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81bafd09
    • Colin Ian King's avatar
      phy: tegra: remove redundant self assignment of 'map' · cc96cdc8
      Colin Ian King authored
      commit a0dd6773 upstream.
      
      The assignment of map to itself is redundant and can be removed.
      Detected with Coccinelle.
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarKishon Vijay Abraham I <kishon@ti.com>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc96cdc8
    • Nathan Chancellor's avatar
      pinctrl: max77620: Use define directive for max77620_pinconf_param values · f6de4ca8
      Nathan Chancellor authored
      commit 1f60652d upstream.
      
      Clang warns when one enumerated type is implicitly converted to another:
      
      drivers/pinctrl/pinctrl-max77620.c:56:12: warning: implicit conversion
      from enumeration type 'enum max77620_pinconf_param' to different
      enumeration type 'enum pin_config_param' [-Wenum-conversion]
                      .param = MAX77620_ACTIVE_FPS_SOURCE,
                               ^~~~~~~~~~~~~~~~~~~~~~~~~~
      
      It is expected that pinctrl drivers can extend pin_config_param because
      of the gap between PIN_CONFIG_END and PIN_CONFIG_MAX so this conversion
      isn't an issue. Most drivers that take advantage of this define the
      PIN_CONFIG variables as constants, rather than enumerated values. Do the
      same thing here so that Clang no longer warns.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/139Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f6de4ca8
    • Eli Cooper's avatar
      netfilter: ipv6: Don't preserve original oif for loopback address · 8c2183d8
      Eli Cooper authored
      commit 15df03c6 upstream.
      
      Commit 508b0904 ("netfilter: ipv6: Preserve link scope traffic
      original oif") made ip6_route_me_harder() keep the original oif for
      link-local and multicast packets. However, it also affected packets
      for the loopback address because it used rt6_need_strict().
      
      REDIRECT rules in the OUTPUT chain rewrite the destination to loopback
      address; thus its oif should not be preserved. This commit fixes the bug
      that redirected local packets are being dropped. Actually the packet was
      not exactly dropped; Instead it was sent out to the original oif rather
      than lo. When a packet with daddr ::1 is sent to the router, it is
      effectively dropped.
      
      Fixes: 508b0904 ("netfilter: ipv6: Preserve link scope traffic original oif")
      Signed-off-by: default avatarEli Cooper <elicooper@gmx.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8c2183d8
    • Pablo Neira Ayuso's avatar
      netfilter: nft_compat: use-after-free when deleting targets · 7cfbf4ce
      Pablo Neira Ayuso authored
      commit 753c111f upstream.
      
      Fetch pointer to module before target object is released.
      
      Fixes: 29e38801 ("netfilter: nf_tables: fix use-after-free when deleting compat expressions")
      Fixes: 0ca743a5 ("netfilter: nf_tables: add compatibility layer for x_tables")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7cfbf4ce
    • Pablo Neira Ayuso's avatar
      netfilter: nf_tables: fix flush after rule deletion in the same batch · c92ba882
      Pablo Neira Ayuso authored
      commit 23b7ca4f upstream.
      
      Flush after rule deletion bogusly hits -ENOENT. Skip rules that have
      been already from nft_delrule_by_chain() which is always called from the
      flush path.
      
      Fixes: cf9dc09d ("netfilter: nf_tables: fix missing rules flushing per table")
      Reported-by: default avatarPhil Sutter <phil@nwl.cc>
      Acked-by: default avatarPhil Sutter <phil@nwl.cc>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c92ba882
    • Hangbin Liu's avatar
      Revert "bridge: do not add port to router list when receives query with source 0.0.0.0" · 9ac5e650
      Hangbin Liu authored
      commit 278e2148 upstream.
      
      This reverts commit 5a2de63f ("bridge: do not add port to router list
      when receives query with source 0.0.0.0") and commit 0fe5119e ("net:
      bridge: remove ipv6 zero address check in mcast queries")
      
      The reason is RFC 4541 is not a standard but suggestive. Currently we
      will elect 0.0.0.0 as Querier if there is no ip address configured on
      bridge. If we do not add the port which recives query with source
      0.0.0.0 to router list, the IGMP reports will not be about to forward
      to Querier, IGMP data will also not be able to forward to dest.
      
      As Nikolay suggested, revert this change first and add a boolopt api
      to disable none-zero election in future if needed.
      Reported-by: default avatarLinus Lüssing <linus.luessing@c0d3.blue>
      Reported-by: default avatarSebastian Gottschall <s.gottschall@newmedia-net.de>
      Fixes: 5a2de63f ("bridge: do not add port to router list when receives query with source 0.0.0.0")
      Fixes: 0fe5119e ("net: bridge: remove ipv6 zero address check in mcast queries")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Acked-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ac5e650
    • Willem de Bruijn's avatar
      net: avoid false positives in untrusted gso validation · d996573e
      Willem de Bruijn authored
      commit 9e8db591 upstream.
      
      GSO packets with vnet_hdr must conform to a small set of gso_types.
      The below commit uses flow dissection to drop packets that do not.
      
      But it has false positives when the skb is not fully initialized.
      Dissection needs skb->protocol and skb->network_header.
      
      Infer skb->protocol from gso_type as the two must agree.
      SKB_GSO_UDP can use both ipv4 and ipv6, so try both.
      
      Exclude callers for which network header offset is not known.
      
      Fixes: d5be7f63 ("net: validate untrusted gso packets without csum offload")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d996573e
    • Willem de Bruijn's avatar
      net: validate untrusted gso packets without csum offload · dac7d443
      Willem de Bruijn authored
      commit d5be7f63 upstream.
      
      Syzkaller again found a path to a kernel crash through bad gso input.
      By building an excessively large packet to cause an skb field to wrap.
      
      If VIRTIO_NET_HDR_F_NEEDS_CSUM was set this would have been dropped in
      skb_partial_csum_set.
      
      GSO packets that do not set checksum offload are suspicious and rare.
      Most callers of virtio_net_hdr_to_skb already pass them to
      skb_probe_transport_header.
      
      Move that test forward, change it to detect parse failure and drop
      packets on failure as those cleary are not one of the legitimate
      VIRTIO_NET_HDR_GSO types.
      
      Fixes: bfd5f4a3 ("packet: Add GSO/csum offload support.")
      Fixes: f43798c2 ("tun: Allow GSO using virtio_net_hdr")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dac7d443
    • Chris Wilson's avatar
      drm/i915/fbdev: Actually configure untiled displays · 6964c1d5
      Chris Wilson authored
      commit d179b88d upstream.
      
      If we skipped all the connectors that were not part of a tile, we would
      leave conn_seq=0 and conn_configured=0, convincing ourselves that we
      had stagnated in our configuration attempts. Avoid this situation by
      starting conn_seq=ALL_CONNECTORS, and repeating until we find no more
      connectors to configure.
      
      Fixes: 754a7659 ("drm/i915/fbdev: Stop repeating tile configuration on stagnation")
      Reported-by: default avatarMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Reviewed-by: default avatarMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190215123019.32283-1-chris@chris-wilson.co.uk
      Cc: <stable@vger.kernel.org> # v3.19+
      (cherry picked from commit d9b308b1)
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6964c1d5
    • Alexey Brodkin's avatar
      ARC: define ARCH_SLAB_MINALIGN = 8 · 238209c6
      Alexey Brodkin authored
      commit b6835ea7 upstream.
      
      The default value of ARCH_SLAB_MINALIGN in "include/linux/slab.h" is
      "__alignof__(unsigned long long)" which for ARC unexpectedly turns out
      to be 4. This is not a compiler bug, but as defined by ARC ABI [1]
      
      Thus slab allocator would allocate a struct which is 32-bit aligned,
      which is generally OK even if struct has long long members.
      There was however potetial problem when it had any atomic64_t which
      use LLOCKD/SCONDD instructions which are required by ISA to take
      64-bit addresses. This is the problem we ran into
      
      [    4.015732] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
      [    4.167881] Misaligned Access
      [    4.172356] Path: /bin/busybox.nosuid
      [    4.176004] CPU: 2 PID: 171 Comm: rm Not tainted 4.19.14-yocto-standard #1
      [    4.182851]
      [    4.182851] [ECR   ]: 0x000d0000 => Check Programmer's Manual
      [    4.190061] [EFA   ]: 0xbeaec3fc
      [    4.190061] [BLINK ]: ext4_delete_entry+0x210/0x234
      [    4.190061] [ERET  ]: ext4_delete_entry+0x13e/0x234
      [    4.202985] [STAT32]: 0x80080002 : IE K
      [    4.207236] BTA: 0x9009329c   SP: 0xbe5b1ec4  FP: 0x00000000
      [    4.212790] LPS: 0x9074b118  LPE: 0x9074b120 LPC: 0x00000000
      [    4.218348] r00: 0x00000040  r01: 0x00000021 r02: 0x00000001
      ...
      ...
      [    4.270510] Stack Trace:
      [    4.274510]   ext4_delete_entry+0x13e/0x234
      [    4.278695]   ext4_rmdir+0xe0/0x238
      [    4.282187]   vfs_rmdir+0x50/0xf0
      [    4.285492]   do_rmdir+0x9e/0x154
      [    4.288802]   EV_Trap+0x110/0x114
      
      The fix is to make sure slab allocations are 64-bit aligned.
      
      Do note that atomic64_t is __attribute__((aligned(8)) which means gcc
      does generate 64-bit aligned references, relative to beginning of
      container struct. However the issue is if the container itself is not
      64-bit aligned, atomic64_t ends up unaligned which is what this patch
      ensures.
      
      [1] https://github.com/foss-for-synopsys-dwc-arc-processors/toolchain/wiki/files/ARCv2_ABI.pdfSigned-off-by: default avatarAlexey Brodkin <abrodkin@synopsys.com>
      Cc: <stable@vger.kernel.org> # 4.8+
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      [vgupta: reworked changelog, added dependency on LL64+LLSC]
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      238209c6
    • Eugeniy Paltsev's avatar
      ARC: U-boot: check arguments paranoidly · e7264579
      Eugeniy Paltsev authored
      commit a66f2e57 upstream.
      
      Handle U-boot arguments paranoidly:
       * don't allow to pass unknown tag.
       * try to use external device tree blob only if corresponding tag
         (TAG_DTB) is set.
       * don't check uboot_tag if kernel build with no ARC_UBOOT_SUPPORT.
      
      NOTE:
      If U-boot args are invalid we skip them and try to use embedded device
      tree blob. We can't panic on invalid U-boot args as we really pass
      invalid args due to bug in U-boot code.
      This happens if we don't provide external DTB to U-boot and
      don't set 'bootargs' U-boot environment variable (which is default
      case at least for HSDK board) In that case we will pass
      {r0 = 1 (bootargs in r2); r1 = 0; r2 = 0;} to linux which is invalid.
      
      While I'm at it refactor U-boot arguments handling code.
      
      Cc: stable@vger.kernel.org
      Tested-by: default avatarCorentin LABBE <clabbe@baylibre.com>
      Signed-off-by: default avatarEugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e7264579
    • Eugeniy Paltsev's avatar
      ARCv2: Enable unaligned access in early ASM code · 1f448141
      Eugeniy Paltsev authored
      commit 252f6e8e upstream.
      
      It is currently done in arc_init_IRQ() which might be too late
      considering gcc 7.3.1 onwards (GNU 2018.03) generates unaligned
      memory accesses by default
      
      Cc: stable@vger.kernel.org #4.4+
      Signed-off-by: default avatarEugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      [vgupta: rewrote changelog]
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f448141
    • Dmitry V. Levin's avatar
      parisc: Fix ptrace syscall number modification · bc423b65
      Dmitry V. Levin authored
      commit b7dc5a07 upstream.
      
      Commit 910cd32e ("parisc: Fix and enable seccomp filter support")
      introduced a regression in ptrace-based syscall tampering: when tracer
      changes syscall number to -1, the kernel fails to initialize %r28 with
      -ENOSYS and subsequently fails to return the error code of the failed
      syscall to userspace.
      
      This erroneous behaviour could be observed with a simple strace syscall
      fault injection command which is expected to print something like this:
      
      $ strace -a0 -ewrite -einject=write:error=enospc echo hello
      write(1, "hello\n", 6) = -1 ENOSPC (No space left on device) (INJECTED)
      write(2, "echo: ", 6) = -1 ENOSPC (No space left on device) (INJECTED)
      write(2, "write error", 11) = -1 ENOSPC (No space left on device) (INJECTED)
      write(2, "\n", 1) = -1 ENOSPC (No space left on device) (INJECTED)
      +++ exited with 1 +++
      
      After commit 910cd32e it loops printing
      something like this instead:
      
      write(1, "hello\n", 6../strace: Failed to tamper with process 12345: unexpectedly got no error (return value 0, error 0)
      ) = 0 (INJECTED)
      
      This bug was found by strace test suite.
      
      Fixes: 910cd32e ("parisc: Fix and enable seccomp filter support")
      Cc: stable@vger.kernel.org # v4.5+
      Signed-off-by: default avatarDmitry V. Levin <ldv@altlinux.org>
      Tested-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc423b65
    • Eric Biggers's avatar
      KEYS: always initialize keyring_index_key::desc_len · 50d039d9
      Eric Biggers authored
      commit ede0fa98 upstream.
      
      syzbot hit the 'BUG_ON(index_key->desc_len == 0);' in __key_link_begin()
      called from construct_alloc_key() during sys_request_key(), because the
      length of the key description was never calculated.
      
      The problem is that we rely on ->desc_len being initialized by
      search_process_keyrings(), specifically by search_nested_keyrings().
      But, if the process isn't subscribed to any keyrings that never happens.
      
      Fix it by always initializing keyring_index_key::desc_len as soon as the
      description is set, like we already do in some places.
      
      The following program reproduces the BUG_ON() when it's run as root and
      no session keyring has been installed.  If it doesn't work, try removing
      pam_keyinit.so from /etc/pam.d/login and rebooting.
      
          #include <stdlib.h>
          #include <unistd.h>
          #include <keyutils.h>
      
          int main(void)
          {
                  int id = add_key("keyring", "syz", NULL, 0, KEY_SPEC_USER_KEYRING);
      
                  keyctl_setperm(id, KEY_OTH_WRITE);
                  setreuid(5000, 5000);
                  request_key("user", "desc", "", id);
          }
      
      Reported-by: syzbot+ec24e95ea483de0a24da@syzkaller.appspotmail.com
      Fixes: b2a4df20 ("KEYS: Expand the capacity of a keyring")
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJames Morris <james.morris@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      50d039d9
    • Eric Biggers's avatar
      KEYS: user: Align the payload buffer · 56a682bd
      Eric Biggers authored
      commit cc1780fc upstream.
      
      Align the payload of "user" and "logon" keys so that users of the
      keyrings service can access it as a struct that requires more than
      2-byte alignment.  fscrypt currently does this which results in the read
      of fscrypt_key::size being misaligned as it needs 4-byte alignment.
      
      Align to __alignof__(u64) rather than __alignof__(long) since in the
      future it's conceivable that people would use structs beginning with
      u64, which on some platforms would require more than 'long' alignment.
      Reported-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Fixes: 2aa349f6 ("[PATCH] Keys: Export user-defined keyring operations")
      Fixes: 88bd6ccd ("ext4 crypto: add encryption key management facilities")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Tested-by: default avatarAaro Koskinen <aaro.koskinen@iki.fi>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarJames Morris <james.morris@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      56a682bd
    • Bart Van Assche's avatar
      RDMA/srp: Rework SCSI device reset handling · 4040907e
      Bart Van Assche authored
      commit 48396e80 upstream.
      
      Since .scsi_done() must only be called after scsi_queue_rq() has
      finished, make sure that the SRP initiator driver does not call
      .scsi_done() while scsi_queue_rq() is in progress. Although
      invoking sg_reset -d while I/O is in progress works fine with kernel
      v4.20 and before, that is not the case with kernel v5.0-rc1. This
      patch avoids that the following crash is triggered with kernel
      v5.0-rc1:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000138
      CPU: 0 PID: 360 Comm: kworker/0:1H Tainted: G    B             5.0.0-rc1-dbg+ #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
      Workqueue: kblockd blk_mq_run_work_fn
      RIP: 0010:blk_mq_dispatch_rq_list+0x116/0xb10
      Call Trace:
       blk_mq_sched_dispatch_requests+0x2f7/0x300
       __blk_mq_run_hw_queue+0xd6/0x180
       blk_mq_run_work_fn+0x27/0x30
       process_one_work+0x4f1/0xa20
       worker_thread+0x67/0x5b0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30
      
      Cc: <stable@vger.kernel.org>
      Fixes: 94a9174c ("IB/srp: reduce lock coverage of command completion")
      Signed-off-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4040907e
    • Konstantin Khlebnikov's avatar
      inet_diag: fix reporting cgroup classid and fallback to priority · 3daca16b
      Konstantin Khlebnikov authored
      [ Upstream commit 1ec17dbd ]
      
      Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested
      extensions has only 8 bits. Thus extensions starting from DCTCPINFO
      cannot be requested directly. Some of them included into response
      unconditionally or hook into some of lower 8 bits.
      
      Extension INET_DIAG_CLASS_ID has not way to request from the beginning.
      
      This patch bundle it with INET_DIAG_TCLASS (ipv6 tos), fixes space
      reservation, and documents behavior for other extensions.
      
      Also this patch adds fallback to reporting socket priority. This filed
      is more widely used for traffic classification because ipv4 sockets
      automatically maps TOS to priority and default qdisc pfifo_fast knows
      about that. But priority could be changed via setsockopt SO_PRIORITY so
      INET_DIAG_TOS isn't enough for predicting class.
      
      Also cgroup2 obsoletes net_cls classid (it always zero), but we cannot
      reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen).
      
      So, after this patch INET_DIAG_CLASS_ID will report socket priority
      for most common setup when net_cls isn't set and/or cgroup2 in use.
      
      Fixes: 0888e372 ("net: inet: diag: expose sockets cgroup classid")
      Signed-off-by: default avatarKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3daca16b
    • Saeed Mahameed's avatar
      net/mlx4_en: Force CHECKSUM_NONE for short ethernet frames · fde4151c
      Saeed Mahameed authored
      [ Upstream commit 29dded89 ]
      
      When an ethernet frame is padded to meet the minimum ethernet frame
      size, the padding octets are not covered by the hardware checksum.
      Fortunately the padding octets are usually zero's, which don't affect
      checksum. However, it is not guaranteed. For example, switches might
      choose to make other use of these octets.
      This repeatedly causes kernel hardware checksum fault.
      
      Prior to the cited commit below, skb checksum was forced to be
      CHECKSUM_NONE when padding is detected. After it, we need to keep
      skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
      verify and parse IP headers, it does not worth the effort as the packets
      are so small that CHECKSUM_COMPLETE has no significant advantage.
      
      Future work: when reporting checksum complete is not an option for
      IP non-TCP/UDP packets, we can actually fallback to report checksum
      unnecessary, by looking at cqe IPOK bit.
      
      Fixes: 88078d98 ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fde4151c
    • Hangbin Liu's avatar
      sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach() · 67b46230
      Hangbin Liu authored
      [ Upstream commit 173656ac ]
      
      If we disabled IPv6 from the kernel command line (ipv6.disable=1), we should
      not call ip6_err_gen_icmpv6_unreach(). This:
      
        ip link add sit1 type sit local 192.0.2.1 remote 192.0.2.2 ttl 1
        ip link set sit1 up
        ip addr add 198.51.100.1/24 dev sit1
        ping 198.51.100.2
      
      if IPv6 is disabled at boot time, will crash the kernel.
      
      v2: there's no need to use in6_dev_get(), use __in6_dev_get() instead,
          as we only need to check that idev exists and we are under
          rcu_read_lock() (from netif_receive_skb_internal()).
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Fixes: ca15a078 ("sit: generate icmpv6 error when receiving icmpv4 error")
      Cc: Oussama Ghorbel <ghorbel@pivasoftware.com>
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      67b46230
    • Cong Wang's avatar
      team: avoid complex list operations in team_nl_cmd_options_set() · 90dbd485
      Cong Wang authored
      [ Upstream commit 2fdeee25 ]
      
      The current opt_inst_list operations inside team_nl_cmd_options_set()
      is too complex to track:
      
          LIST_HEAD(opt_inst_list);
          nla_for_each_nested(...) {
              list_for_each_entry(opt_inst, &team->option_inst_list, list) {
                  if (__team_option_inst_tmp_find(&opt_inst_list, opt_inst))
                      continue;
                  list_add(&opt_inst->tmp_list, &opt_inst_list);
              }
          }
          team_nl_send_event_options_get(team, &opt_inst_list);
      
      as while we retrieve 'opt_inst' from team->option_inst_list, it could
      be added to the local 'opt_inst_list' for multiple times. The
      __team_option_inst_tmp_find() doesn't work, as the setter
      team_mode_option_set() still calls team->ops.exit() which uses
      ->tmp_list too in __team_options_change_check().
      
      Simplify the list operations by moving the 'opt_inst_list' and
      team_nl_send_event_options_get() into the nla_for_each_nested() loop so
      that it can be guranteed that we won't insert a same list entry for
      multiple times. Therefore, __team_option_inst_tmp_find() can be removed
      too.
      
      Fixes: 4fb0534f ("team: avoid adding twice the same option to the event list")
      Fixes: 2fcdb2c9 ("team: allow to send multiple set events in one message")
      Reported-by: syzbot+4d4af685432dc0e56c91@syzkaller.appspotmail.com
      Reported-by: syzbot+68ee510075cf64260cc4@syzkaller.appspotmail.com
      Cc: Jiri Pirko <jiri@resnulli.us>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Reviewed-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      90dbd485
    • Xin Long's avatar
      sctp: call gso_reset_checksum when computing checksum in sctp_gso_segment · 77278f05
      Xin Long authored
      [ Upstream commit fc228abc ]
      
      Jianlin reported a panic when running sctp gso over gre over vlan device:
      
        [   84.772930] RIP: 0010:do_csum+0x6d/0x170
        [   84.790605] Call Trace:
        [   84.791054]  csum_partial+0xd/0x20
        [   84.791657]  gre_gso_segment+0x2c3/0x390
        [   84.792364]  inet_gso_segment+0x161/0x3e0
        [   84.793071]  skb_mac_gso_segment+0xb8/0x120
        [   84.793846]  __skb_gso_segment+0x7e/0x180
        [   84.794581]  validate_xmit_skb+0x141/0x2e0
        [   84.795297]  __dev_queue_xmit+0x258/0x8f0
        [   84.795949]  ? eth_header+0x26/0xc0
        [   84.796581]  ip_finish_output2+0x196/0x430
        [   84.797295]  ? skb_gso_validate_network_len+0x11/0x80
        [   84.798183]  ? ip_finish_output+0x169/0x270
        [   84.798875]  ip_output+0x6c/0xe0
        [   84.799413]  ? ip_append_data.part.50+0xc0/0xc0
        [   84.800145]  iptunnel_xmit+0x144/0x1c0
        [   84.800814]  ip_tunnel_xmit+0x62d/0x930 [ip_tunnel]
        [   84.801699]  gre_tap_xmit+0xac/0xf0 [ip_gre]
        [   84.802395]  dev_hard_start_xmit+0xa5/0x210
        [   84.803086]  sch_direct_xmit+0x14f/0x340
        [   84.803733]  __dev_queue_xmit+0x799/0x8f0
        [   84.804472]  ip_finish_output2+0x2e0/0x430
        [   84.805255]  ? skb_gso_validate_network_len+0x11/0x80
        [   84.806154]  ip_output+0x6c/0xe0
        [   84.806721]  ? ip_append_data.part.50+0xc0/0xc0
        [   84.807516]  sctp_packet_transmit+0x716/0xa10 [sctp]
        [   84.808337]  sctp_outq_flush+0xd7/0x880 [sctp]
      
      It was caused by SKB_GSO_CB(skb)->csum_start not set in sctp_gso_segment.
      sctp_gso_segment() calls skb_segment() with 'feature | NETIF_F_HW_CSUM',
      which causes SKB_GSO_CB(skb)->csum_start not to be set in skb_segment().
      
      For TCP/UDP, when feature supports HW_CSUM, CHECKSUM_PARTIAL will be set
      and gso_reset_checksum will be called to set SKB_GSO_CB(skb)->csum_start.
      
      So SCTP should do the same as TCP/UDP, to call gso_reset_checksum() when
      computing checksum in sctp_gso_segment.
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      77278f05
    • Russell King's avatar
      net: sfp: do not probe SFP module before we're attached · c4ba68b8
      Russell King authored
      [ Upstream commit b5bfc21a ]
      
      When we probe a SFP module, we expect to be able to call the upstream
      device's module_insert() function so that the upstream link can be
      configured.  However, when the upstream device is delayed, we currently
      may end up probing the module before the upstream device is available,
      and lose the module_insert() call.
      
      Avoid this by holding off probing the module until the SFP bus is
      properly connected to both the SFP socket driver and the upstream
      driver.
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c4ba68b8
    • Kal Conley's avatar
      net/packet: fix 4gb buffer limit due to overflow check · 2226f959
      Kal Conley authored
      [ Upstream commit fc62814d ]
      
      When calculating rb->frames_per_block * req->tp_block_nr the result
      can overflow. Check it for overflow without limiting the total buffer
      size to UINT_MAX.
      
      This change fixes support for packet ring buffers >= UINT_MAX.
      
      Fixes: 8f8d28e4 ("net/packet: fix overflow in check for tp_frame_nr")
      Signed-off-by: default avatarKal Conley <kal.conley@dectris.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2226f959
    • Tonghao Zhang's avatar
      net/mlx5e: Don't overwrite pedit action when multiple pedit used · 7c3969ff
      Tonghao Zhang authored
      [ Upstream commit 218d05ce ]
      
      In some case, we may use multiple pedit actions to modify packets.
      The command shown as below: the last pedit action is effective.
      
      $ tc filter add dev netdev_rep parent ffff: protocol ip prio 1    \
      	flower skip_sw ip_proto icmp dst_ip 3.3.3.3        \
      	action pedit ex munge ip dst set 192.168.1.100 pipe    \
      	action pedit ex munge eth src set 00:00:00:00:00:01 pipe    \
      	action pedit ex munge eth dst set 00:00:00:00:00:02 pipe    \
      	action csum ip pipe    \
      	action tunnel_key set src_ip 1.1.1.100 dst_ip 1.1.1.200 dst_port 4789 id 100 \
      	action mirred egress redirect dev vxlan0
      
      To fix it, we add max_mod_hdr_actions to mlx5e_tc_flow_parse_attr struction,
      max_mod_hdr_actions will store the max pedit action number we support and
      num_mod_hdr_actions indicates how many pedit action we used, and store all
      pedit action to mod_hdr_actions.
      
      Fixes: d79b6df6 ("net/mlx5e: Add parsing of TC pedit actions to HW format")
      Cc: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarTonghao Zhang <xiangxia.m.yue@gmail.com>
      Reviewed-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Acked-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c3969ff
    • Li RongQing's avatar
      ipv6: propagate genlmsg_reply return code · bb506ddb
      Li RongQing authored
      [ Upstream commit d1f20798 ]
      
      genlmsg_reply can fail, so propagate its return code
      
      Fixes: 915d7e5e ("ipv6: sr: add code base for control plane support of SR-IPv6")
      Signed-off-by: default avatarLi RongQing <lirongqing@baidu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bb506ddb
    • Eric Dumazet's avatar
      batman-adv: fix uninit-value in batadv_interface_tx() · f08f5424
      Eric Dumazet authored
      [ Upstream commit 4ffcbfac ]
      
      KMSAN reported batadv_interface_tx() was possibly using a
      garbage value [1]
      
      batadv_get_vid() does have a pskb_may_pull() call
      but batadv_interface_tx() does not actually make sure
      this did not fail.
      
      [1]
      BUG: KMSAN: uninit-value in batadv_interface_tx+0x908/0x1e40 net/batman-adv/soft-interface.c:231
      CPU: 0 PID: 10006 Comm: syz-executor469 Not tainted 4.20.0-rc7+ #5
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x173/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
       __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313
       batadv_interface_tx+0x908/0x1e40 net/batman-adv/soft-interface.c:231
       __netdev_start_xmit include/linux/netdevice.h:4356 [inline]
       netdev_start_xmit include/linux/netdevice.h:4365 [inline]
       xmit_one net/core/dev.c:3257 [inline]
       dev_hard_start_xmit+0x607/0xc40 net/core/dev.c:3273
       __dev_queue_xmit+0x2e42/0x3bc0 net/core/dev.c:3843
       dev_queue_xmit+0x4b/0x60 net/core/dev.c:3876
       packet_snd net/packet/af_packet.c:2928 [inline]
       packet_sendmsg+0x8306/0x8f30 net/packet/af_packet.c:2953
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg net/socket.c:631 [inline]
       __sys_sendto+0x8c4/0xac0 net/socket.c:1788
       __do_sys_sendto net/socket.c:1800 [inline]
       __se_sys_sendto+0x107/0x130 net/socket.c:1796
       __x64_sys_sendto+0x6e/0x90 net/socket.c:1796
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      RIP: 0033:0x441889
      Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 bb 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007ffdda6fd468 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 0000000000441889
      RDX: 000000000000000e RSI: 00000000200000c0 RDI: 0000000000000003
      RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000216 R12: 00007ffdda6fd4c0
      R13: 00007ffdda6fd4b0 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
       kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
       kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
       kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
       slab_post_alloc_hook mm/slab.h:446 [inline]
       slab_alloc_node mm/slub.c:2759 [inline]
       __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
       __kmalloc_reserve net/core/skbuff.c:137 [inline]
       __alloc_skb+0x309/0xa20 net/core/skbuff.c:205
       alloc_skb include/linux/skbuff.h:998 [inline]
       alloc_skb_with_frags+0x1c7/0xac0 net/core/skbuff.c:5220
       sock_alloc_send_pskb+0xafd/0x10e0 net/core/sock.c:2083
       packet_alloc_skb net/packet/af_packet.c:2781 [inline]
       packet_snd net/packet/af_packet.c:2872 [inline]
       packet_sendmsg+0x661a/0x8f30 net/packet/af_packet.c:2953
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg net/socket.c:631 [inline]
       __sys_sendto+0x8c4/0xac0 net/socket.c:1788
       __do_sys_sendto net/socket.c:1800 [inline]
       __se_sys_sendto+0x107/0x130 net/socket.c:1796
       __x64_sys_sendto+0x6e/0x90 net/socket.c:1796
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      Fixes: c6c8fea2 ("net: Add batman-adv meshing protocol")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc:	Marek Lindner <mareklindner@neomailbox.ch>
      Cc:	Simon Wunderlich <sw@simonwunderlich.de>
      Cc:	Antonio Quartulli <a@unstable.cc>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f08f5424
    • Nathan Chancellor's avatar
      isdn: avm: Fix string plus integer warning from Clang · 3cbc51d6
      Nathan Chancellor authored
      [ Upstream commit 7afa81c5 ]
      
      A recent commit in Clang expanded the -Wstring-plus-int warning, showing
      some odd behavior in this file.
      
      drivers/isdn/hardware/avm/b1.c:426:30: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
                      cinfo->version[j] = "\0\0" + 1;
                                          ~~~~~~~^~~
      drivers/isdn/hardware/avm/b1.c:426:30: note: use array indexing to silence this warning
                      cinfo->version[j] = "\0\0" + 1;
                                                 ^
                                          &      [  ]
      1 warning generated.
      
      This is equivalent to just "\0". Nick pointed out that it is smarter to
      use "" instead of "\0" because "" is used elsewhere in the kernel and
      can be deduplicated at the linking stage.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/309Suggested-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3cbc51d6