1. 17 Mar, 2020 1 commit
  2. 16 Mar, 2020 23 commits
  3. 15 Mar, 2020 8 commits
    • geneve: move debug check after netdev unregister · 0fda7600
      Florian Westphal authored
      The debug check must be done after unregister_netdevice_many() call --
      the list_del() for this is done inside .ndo_stop.
      
      Fixes: 2843a253 ("geneve: speedup geneve tunnels dismantle")
      Reported-and-tested-by: <syzbot+68a8ed58e3d17c700de5@syzkaller.appspotmail.com>
      Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/packet: tpacket_rcv: avoid a producer race condition · 61fad681
      Willem de Bruijn authored
      PACKET_RX_RING can cause multiple writers to access the same slot if a
      fast writer wraps the ring while a slow writer is still copying. This
      is particularly likely with few, large, slots (e.g., GSO packets).
      
      Synchronize kernel thread ownership of rx ring slots with a bitmap.
      
      Writers acquire a slot race-free by testing tp_status TP_STATUS_KERNEL
      while holding the sk receive queue lock. They release this lock before
      copying and set tp_status to TP_STATUS_USER to release to userspace
      when done. During copying, another writer may take the lock, also see
      TP_STATUS_KERNEL, and start writing to the same slot.
      
      Introduce a new rx_owner_map bitmap with a bit per slot. To acquire a
      slot, test and set with the lock held. To release race-free, update
      tp_status and owner bit as a transaction, so take the lock again.
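
The acquire and release steps described above can be modelled in userspace. This is a minimal sketch, not the kernel code: `struct ring`, `slot_acquire()`, `slot_release()`, the `ST_*` constants and the spinlock standing in for the sk receive queue lock are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8   /* hypothetical ring size; must be <= 64 here */
#define ST_KERNEL  0u  /* slot may be claimed by a kernel-side writer */
#define ST_USER    1u  /* slot has been handed to userspace */

struct ring {
    atomic_flag lock;             /* stand-in for the receive queue lock */
    uint32_t status[RING_SLOTS];  /* stand-in for per-slot tp_status */
    uint64_t owner_map;           /* one bit per slot, like rx_owner_map */
};

static void ring_init(struct ring *r)
{
    atomic_flag_clear(&r->lock);
    for (unsigned int i = 0; i < RING_SLOTS; i++)
        r->status[i] = ST_KERNEL;
    r->owner_map = 0;
}

static void ring_lock(struct ring *r)
{
    while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
        ; /* spin until the flag was previously clear */
}

static void ring_unlock(struct ring *r)
{
    atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Claim a slot for writing: test and set the owner bit with the lock held. */
static bool slot_acquire(struct ring *r, unsigned int idx)
{
    bool ok = false;

    assert(idx < RING_SLOTS);
    ring_lock(r);
    if (r->status[idx] == ST_KERNEL && !(r->owner_map & (1ULL << idx))) {
        r->owner_map |= 1ULL << idx;
        ok = true;
    }
    ring_unlock(r);
    return ok; /* the copy into the slot then runs without the lock */
}

/* Hand the slot to userspace: status and owner bit change as one step. */
static void slot_release(struct ring *r, unsigned int idx)
{
    assert(idx < RING_SLOTS);
    ring_lock(r);
    r->status[idx] = ST_USER;
    r->owner_map &= ~(1ULL << idx);
    ring_unlock(r);
}
```

In this model a second writer that sees `ST_KERNEL` while the first is still copying is turned away by the owner bit, which is the case the status check alone cannot catch.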
      
      This is one of a variety of discussed options (see Link below):
      
      * instead of a shadow ring, embed the data in the slot itself, such as
      in tp_padding. But any test for this field may match a value left by
      userspace, causing deadlock.
      
      * avoid the lock on release. This leaves a small race if releasing the
      shadow slot before setting TP_STATUS_USER. The below reproducer showed
      that this race is not academic. If releasing the slot after tp_status,
      the race is more subtle. See the first link for details.
      
      * add a new tp_status TP_KERNEL_OWNED to avoid the transactional store
      of two fields. But, legacy applications may interpret all non-zero
      tp_status as owned by the user. As libpcap does. So this is possible
      only opt-in by newer processes. It can be added as an optional mode.
      
      * embed the struct at the tail of pg_vec to avoid extra allocation.
      The implementation proved no less complex than a separate field.
      
      The additional locking cost on release adds contention, no different
      than scaling on multicore or multiqueue h/w. In practice, neither the below
      reproducer nor small packet tcpdump showed a noticeable change in
      perf report in cycles spent in spinlock. Where contention is
      problematic, packet sockets support mitigation through PACKET_FANOUT.
      And we can consider adding opt-in state TP_KERNEL_OWNED.
      
      Easy to reproduce by running multiple netperf or similar TCP_STREAM
      flows concurrently with `tcpdump -B 129 -n greater 60000`.
      
      Based on an earlier patchset by Jon Rosen. See links below.
      
      I believe this issue goes back to the introduction of tpacket_rcv,
      which predates git history.
      
      Link: https://www.mail-archive.com/netdev@vger.kernel.org/msg237222.html
      Suggested-by: Jon Rosen <jrosen@cisco.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Jon Rosen <jrosen@cisco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip_gre: Separate ERSPAN newlink / changelink callbacks · e1f8f78f
      Petr Machata authored
      ERSPAN shares most of the code path with GRE and gretap code. While that
      helps keep the code compact, it is also error prone. Currently a broken
      userspace can turn a gretap tunnel into a de facto ERSPAN one by passing
      IFLA_GRE_ERSPAN_VER. There has been a similar issue in ip6gretap in the
      past.
      
      To prevent these problems in future, split the newlink and changelink code
      paths. Split the ERSPAN code out of ipgre_netlink_parms() into a new
      function erspan_netlink_parms(). Extract a piece of common logic from
      ipgre_newlink() and ipgre_changelink() into ipgre_newlink_encap_setup().
      Add erspan_newlink() and erspan_changelink().
      
      Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: fix delete filter entry fail in unload path · 46ea929b
      Shahjada Abul Husain authored
      Currently, the hardware TID index is assumed to start from index 0.
      However, with the following changeset,
      
      commit c2193999 ("cxgb4: add support for high priority filters")
      
      hardware TID index can start after the high priority region, which
      has introduced a regression resulting in a failure to remove filter
      entries in the cxgb4 unload path. This patch fixes that.
      
      Fixes: c2193999 ("cxgb4: add support for high priority filters")
      Signed-off-by: Shahjada Abul Husain <shahjada@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: platform: Fix misleading interrupt error msg · fc191af1
      Markus Fuchs authored
      Not every stmmac based platform makes use of the eth_wake_irq or eth_lpi
      interrupts. Use the platform_get_irq_byname_optional variant for these
      interrupts, so no error message is displayed if they can't be found.
      Instead, print an informational message hinting that something might be
      wrong, to assist debugging on platforms which use these interrupts.
      Signed-off-by: Markus Fuchs <mklntf@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/bpfilter: fix dprintf usage for /dev/kmsg · 13d0f7b8
      Bruno Meneguele authored
      The bpfilter UMH code was recently changed to log its informative messages to
      /dev/kmsg. However, this interface doesn't yet support SEEK_CUR, which
      dprintf() uses. As a result, dprintf() returns -EINVAL and doesn't log anything.
      
      Although there have been discussions in the past about supporting SEEK_CUR in
      the /dev/kmsg interface, they were never concluded. Since the only user of
      that interface from a userspace perspective inside the kernel is the bpfilter
      UMH (userspace) module, it's better to correct it here instead of waiting for
      a conclusion on the interface.
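
One way to log to a file descriptor without going through dprintf() is to format into a buffer and issue a single write(). The sketch below illustrates that general technique only. It is not taken from the patch, and `log_to_fd()` and its 256-byte buffer are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: format the message, then hand it to one write(). */
static int log_to_fd(int fd, const char *fmt, ...)
{
    char buf[256];
    va_list ap;
    int len;

    assert(fmt != NULL);

    va_start(ap, fmt);
    len = vsnprintf(buf, sizeof(buf), fmt, ap);
    va_end(ap);

    if (len < 0)
        return -1;                   /* formatting error */
    if ((size_t)len >= sizeof(buf))
        len = (int)sizeof(buf) - 1;  /* message was truncated to fit */

    /* A single write() never seeks, so the fd need not support SEEK_CUR. */
    return write(fd, buf, (size_t)len) == (ssize_t)len ? len : -1;
}
```

A caller would open the log device once and pass the descriptor to each call. The example can be exercised against a pipe, which does not support seeking either.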
      
      Fixes: 36c4357c ("net: bpfilter: print umh messages to /dev/kmsg")
      Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: keep alloc_hash updated after hash allocation · 0d1c3530
      Cong Wang authored
      In commit 599be01e ("net_sched: fix an OOB access in cls_tcindex")
      I moved cp->hash calculation before the first
      tcindex_alloc_perfect_hash(), but cp->alloc_hash is left untouched.
      This difference could lead to another out-of-bounds access.
      
      cp->alloc_hash should always be the size allocated, so we should
      update it right after this tcindex_alloc_perfect_hash() call.
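
The invariant can be illustrated with a small userspace model. This is a sketch under assumptions, not the cls_tcindex code: `struct table`, `table_alloc()` and `table_lookup()` are hypothetical, and only the field names `hash` and `alloc_hash` echo the commit message.

```c
#include <assert.h>
#include <stdlib.h>

struct table {
    int *perfect;             /* allocated array of entries */
    unsigned int hash;        /* number of entries the table should hold */
    unsigned int alloc_hash;  /* number of entries actually allocated */
};

/* Allocate t->hash entries and record the allocated size immediately. */
static int table_alloc(struct table *t)
{
    int *p;

    assert(t->hash > 0);
    p = calloc(t->hash, sizeof(*p));
    if (!p)
        return -1;

    free(t->perfect);
    t->perfect = p;
    t->alloc_hash = t->hash;  /* keep the size in sync with the array */
    return 0;
}

/* Bounds-check against what was allocated, not what was requested. */
static int *table_lookup(struct table *t, unsigned int key)
{
    return key < t->alloc_hash ? &t->perfect[key] : NULL;
}
```

If `alloc_hash` were left stale after a reallocation, the bounds check in `table_lookup()` would no longer describe the array it guards.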
      
      Reported-and-tested-by: syzbot+dcc34d54d68ef7d2d53d@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+c72da7b9ed57cde6fca2@syzkaller.appspotmail.com
      Fixes: 599be01e ("net_sched: fix an OOB access in cls_tcindex")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: hold rtnl lock in tcindex_partial_destroy_work() · b1be2e8c
      Cong Wang authored
      syzbot reported a use-after-free in tcindex_dump(). This is due to
      the lack of RTNL in the deferred rcu work. We queue this work with
      RTNL held in tcindex_change(); later, tcindex_dump() is called:
      
              fh = tp->ops->get(tp, t->tcm_handle);
              ...
              err = tp->ops->change(..., &fh, ...);
              tfilter_notify(..., fh, ...);
      
      but there is nothing to serialize the pending
      tcindex_partial_destroy_work() with tcindex_dump().
      
      Fix this by simply holding RTNL in tcindex_partial_destroy_work(),
      so that it won't be called until RTNL is released after
      tc_new_tfilter() is completed.
      
      Reported-and-tested-by: syzbot+653090db2562495901dc@syzkaller.appspotmail.com
      Fixes: 3d210534 ("net_sched: fix a race condition in tcindex_destroy()")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 13 Mar, 2020 4 commits
    • Merge tag 'wireless-drivers-2020-03-13' of... · 94b18a87
      David S. Miller authored
      Merge tag 'wireless-drivers-2020-03-13' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for v5.6
      
      Third, and hopefully last, set of fixes for v5.6.
      
      iwlwifi
      
      * fix a locking issue in time events handling
      
      * a fix in rate-scaling
      
      * fix for a potential NULL pointer deref
      
      * enable antenna diversity in some devices that were erroneously not doing it
      
      * allow FW dumps to continue when the FW is stuck
      
      * a fix in the HE capabilities handling
      
      * another fix for FW dumps where we were reading wrong addresses
      
      * fix link in MAINTAINERS file
      
      rtlwifi
      
      * fix regression causing connect issues in v5.4
      
      wlcore
      
      * remove merge damage which luckily didn't have any impact on functionality
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 242a6df6
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2020-03-12
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 12 non-merge commits during the last 8 day(s) which contain
      a total of 12 files changed, 161 insertions(+), 15 deletions(-).
      
      The main changes are:
      
      1) Andrii fixed two bugs in cgroup-bpf.
      
      2) John fixed sockmap.
      
      3) Luke fixed x32 jit.
      
      4) Martin fixed two issues in struct_ops.
      
      5) Yonghong fixed bpf_send_signal.
      
      6) Yoshiki fixed BTF enum.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'drm-fixes-2020-03-13' of git://anongit.freedesktop.org/drm/drm · 0d81a3f2
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "It's a bit quieter, probably not as much as it could be.
      
        There is one large regression fix in here from Lyude for displayport
        bandwidth calculations, there've been reports of multi-monitor in
        docks not working since -rc1 and this has been tested to fix those.
      
        Otherwise it's a bunch of i915 (with some GVT fixes), a set of amdgpu
        watermark + bios fixes, and an exynos iommu cleanup fix.
      
        core:
         - DP MST bandwidth regression fix.
      
        i915:
         - hard lockup fix
         - GVT fixes
         - 32-bit alignment issue fix
         - timeline wait fixes
         - cacheline_retire and free
      
        amdgpu:
         - Update the display watermark bounding box for navi14
         - Fix fetching vbios directly from rom on vega20/arcturus
         - Navi and renoir watermark fixes
      
        exynos:
         - iommu object cleanup fix"
      
      * tag 'drm-fixes-2020-03-13' of git://anongit.freedesktop.org/drm/drm:
        drm/dp_mst: Rewrite and fix bandwidth limit checks
        drm/dp_mst: Reprobe path resources in CSN handler
        drm/dp_mst: Use full_pbn instead of available_pbn for bandwidth checks
        drm/dp_mst: Rename drm_dp_mst_is_dp_mst_end_device() to be less redundant
        drm/i915: Defer semaphore priority bumping to a workqueue
        drm/i915/gt: Close race between cacheline_retire and free
        drm/i915/execlists: Enable timeslice on partial virtual engine dequeue
        drm/i915: be more solid in checking the alignment
        drm/i915/gvt: Fix dma-buf display blur issue on CFL
        drm/i915: Return early for await_start on same timeline
        drm/i915: Actually emit the await_start
        drm/amdgpu/powerplay: nv1x, renior copy dcn clock settings of watermark to smu during boot up
        drm/exynos: Fix cleanup of IOMMU related objects
        drm/amdgpu: correct ROM_INDEX/DATA offset for VEGA20
        drm/amd/display: update soc bb for nv14
        drm/i915/gvt: Fix emulated vbt size issue
        drm/i915/gvt: Fix unnecessary schedule timer when no vGPU exits
    • Merge tag 'topic/mst-bw-check-fixes-for-airlied-2020-03-12-2' of... · 16b78f05
      Dave Airlie authored
      Merge tag 'topic/mst-bw-check-fixes-for-airlied-2020-03-12-2' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
      
      UAPI Changes: None
      
      Cross-subsystem Changes: None
      
      Core Changes: Fixed regressions introduced by commit cd82d82c
      ("drm/dp_mst: Add branch bandwidth validation to MST atomic check"),
      which would cause us to:
      
      * Calculate the available bandwidth on an MST topology incorrectly, and
        as a result reject most display configurations that would try to enable
        more than one sink on a topology
      * Occasionally expose MST connectors to userspace before finishing
        probing their PBN capabilities, resulting in us rejecting display
        configurations because we assumed briefly that no bandwidth was
        available
      
      Driver Changes: None
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Lyude Paul <lyude@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/bf16ee577567beed91c86b7d9cda3ec2e8c50a71.camel@redhat.com
  5. 12 Mar, 2020 4 commits
    • Merge tag 'drm-intel-fixes-2020-03-12' of... · f31d83f0
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2020-03-12' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      drm/i915 fixes for v5.6-rc6:
      - hard lockup fix
      - GVT fixes
      - 32-bit alignment issue fix
      - timeline wait fixes
      - cacheline_retire and free
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/87lfo6ksvw.fsf@intel.com
    • Merge tag 'amd-drm-fixes-5.6-2020-03-11' of... · d9443265
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-5.6-2020-03-11' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
      
      amd-drm-fixes-5.6-2020-03-11:
      
      amdgpu:
      - Update the display watermark bounding box for navi14
      - Fix fetching vbios directly from rom on vega20/arcturus
      - Navi and renoir watermark fixes
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Alex Deucher <alexdeucher@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200312020924.4161-1-alexander.deucher@amd.com
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 1b51f694
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "It looks like a decent sized set of fixes, but a lot of these are one
        liner off-by-one and similar type changes:
      
         1) Fix netlink header pointer to calculate bad attribute offset
            reported to user. From Pablo Neira Ayuso.
      
         2) Don't double clear PHY interrupts when ->did_interrupt is set,
            from Heiner Kallweit.
      
         3) Add missing validation of various (devlink, nl802154, fib, etc.)
            attributes, from Jakub Kicinski.
      
         4) Missing *pos increments in various netfilter seq_next ops, from
            Vasily Averin.
      
         5) Missing break in of_mdiobus_register() loop, from Dajun Jin.
      
         6) Don't double bump tx_dropped in veth driver, from Jiang Lidong.
      
         7) Work around FMAN erratum A050385, from Madalin Bucur.
      
         8) Make sure ARP header is pulled early enough in bonding driver,
            from Eric Dumazet.
      
         9) Do a cond_resched() during multicast processing of ipvlan and
            macvlan, from Mahesh Bandewar.
      
        10) Don't attach cgroups to unrelated sockets when in interrupt
            context, from Shakeel Butt.
      
        11) Fix tpacket ring state management when encountering unknown GSO
            types. From Willem de Bruijn.
      
        12) Fix MDIO bus PHY resume by checking mdio_bus_phy_may_suspend()
            only in the suspend context. From Heiner Kallweit"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (112 commits)
        net: systemport: fix index check to avoid an array out of bounds access
        tc-testing: add ETS scheduler to tdc build configuration
        net: phy: fix MDIO bus PM PHY resuming
        net: hns3: clear port base VLAN when unload PF
        net: hns3: fix RMW issue for VLAN filter switch
        net: hns3: fix VF VLAN table entries inconsistent issue
        net: hns3: fix "tc qdisc del" failed issue
        taprio: Fix sending packets without dequeueing them
        net: mvmdio: avoid error message for optional IRQ
        net: dsa: mv88e6xxx: Add missing mask of ATU occupancy register
        net: memcg: fix lockdep splat in inet_csk_accept()
        s390/qeth: implement smarter resizing of the RX buffer pool
        s390/qeth: refactor buffer pool code
        s390/qeth: use page pointers to manage RX buffer pool
        seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number
        net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed
        net/packet: tpacket_rcv: do not increment ring index on drop
        sxgbe: Fix off by one in samsung driver strncpy size arg
        net: caif: Add lockdep expression to RCU traversal primitive
        MAINTAINERS: remove Sathya Perla as Emulex NIC maintainer
        ...
    • drm/dp_mst: Rewrite and fix bandwidth limit checks · 047d4cd2
      Lyude Paul authored
      Sigh, this is mostly my fault for not giving commit cd82d82c
      ("drm/dp_mst: Add branch bandwidth validation to MST atomic check")
      enough scrutiny during review. The way we're checking bandwidth
      limitations here is mostly wrong:
      
      For starters, drm_dp_mst_atomic_check_bw_limit() determines the
      pbn_limit of a branch by simply scanning each port on the current branch
      device, then uses the last non-zero full_pbn value that it finds. It
      then counts the sum of the PBN used on each branch device for that
      level, and compares against the full_pbn value it found before.
      
      This is wrong because ports can and will have different PBN limitations
      on many hubs, especially since a number of DisplayPort hubs out there
      will be clever and only use the smallest link rate required for each
      downstream sink - potentially giving every port a different full_pbn
      value depending on what link rate it's trained at. This means with our
      current code, which max PBN value we end up with is not well defined.
      
      Additionally, we also need to remember when checking bandwidth
      limitations that the top-most device in any MST topology is a branch
      device, not a port. This means that the first level of a topology
      doesn't technically have a full_pbn value that needs to be checked.
      Instead, we should assume that so long as our VCPI allocations fit we're
      within the bandwidth limitations of the primary MSTB.
      
      We do, however, want to check full_pbn on every port including those of
      the primary MSTB. However, it's important to keep in mind that this
      value represents the minimum link rate /between a port's sink or mstb,
      and the mstb itself/. A quick diagram to explain:
      
                                      MSTB #1
                                     /       \
                                    /         \
                                 Port #1    Port #2
             full_pbn for Port #1 → |          | ← full_pbn for Port #2
                                 Sink #1    MSTB #2
                                               |
                                             etc...
      
      Note that in the above diagram, the combined PBN from all VCPI
      allocations on said hub should not exceed the full_pbn value of port #2,
      and the display configuration on sink #1 should not exceed the full_pbn
      value of port #1. However, port #1 and port #2 can otherwise consume as
      much bandwidth as they want so long as their VCPI allocations still fit.
      
      And finally - our current bandwidth checking code also makes the mistake
      of not checking whether something is an end device or not before trying
      to traverse down it.
      
      So, let's fix it by rewriting our bandwidth checking helpers. We split
      the function into one part for handling branches which simply adds up
      the total PBN on each branch and returns it, and one for checking each
      port to ensure we're not going over its PBN limit. Phew.
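
The split described above can be sketched as two mutually recursive helpers. This is a simplified userspace model with hypothetical names (`struct port`, `struct branch`, `alloc_pbn`, `port_pbn_used()`, `branch_pbn_used()`). It only illustrates the shape of the check and leaves out the atomic state handling of the real helpers.

```c
#include <assert.h>
#include <stddef.h>

struct branch;

struct port {
    int full_pbn;         /* PBN limit of the link below this port */
    int alloc_pbn;        /* PBN allocated to the sink on this port */
    struct branch *mstb;  /* downstream branch, NULL for an end device */
};

struct branch {
    struct port *ports;
    size_t num_ports;
};

static int port_pbn_used(const struct port *p);

/* Add up the PBN used below a branch; a negative value means a port failed. */
static int branch_pbn_used(const struct branch *b)
{
    int total = 0;

    assert(b != NULL);
    for (size_t i = 0; i < b->num_ports; i++) {
        int used = port_pbn_used(&b->ports[i]);

        if (used < 0)
            return used;
        total += used;
    }
    return total;
}

/* Check one port: only descend when the port is not an end device. */
static int port_pbn_used(const struct port *p)
{
    int used = p->mstb ? branch_pbn_used(p->mstb) : p->alloc_pbn;

    if (used < 0)
        return used;
    if (used > p->full_pbn)
        return -1;  /* over this port's limit */
    return used;
}
```

Note that the top-level branch has no limit of its own in this model, matching the point that the first level of a topology has no full_pbn value to check.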
      
      This should fix regressions seen, where we erroneously reject display
      configurations due to thinking they're going over our bandwidth limits
      when they're not.
      
      Changes since v1:
      * Took an even closer look at how PBN limitations are supposed to be
        handled, and did some experimenting with Sean Paul. Ended up rewriting
        these helpers again, but this time they should actually be correct!
      Changes since v2:
      * Small indenting fix
      * Fix pbn_used check in drm_dp_mst_atomic_check_port_bw_limit()
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Fixes: cd82d82c ("drm/dp_mst: Add branch bandwidth validation to MST atomic check")
      Cc: Sean Paul <seanpaul@google.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Reviewed-by: Mikita Lipski <mikita.lipski@amd.com>
      Tested-by: Hans de Goede <hdegoede@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200309210131.1497545-1-lyude@redhat.com