1. 26 Oct, 2023 2 commits
    • Merge tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · c17cda15
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from WiFi and netfilter.
      
        Most regressions addressed here come from quite old versions, with the
        exceptions of the iavf one and the WiFi fixes. No known outstanding
        reports or investigation.
      
        Fixes to fixes:
      
         - eth: iavf: in iavf_down, disable queues when removing the driver
      
        Previous releases - regressions:
      
         - sched: act_ct: additional checks for outdated flows
      
         - tcp: do not leave an empty skb in write queue
      
         - tcp: fix wrong RTO timeout when received SACK reneging
      
         - wifi: cfg80211: pass correct pointer to rdev_inform_bss()
      
         - eth: i40e: sync next_to_clean and next_to_process for programming
           status desc
      
         - eth: iavf: initialize waitqueues before starting watchdog_task
      
        Previous releases - always broken:
      
         - eth: r8169: fix data-races
      
         - eth: igb: fix potential memory leak in igb_add_ethtool_nfc_entry
      
         - eth: r8152: avoid writing garbage to the adapter's registers
      
         - eth: gtp: fix fragmentation needed check with gso"
      
      * tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
        iavf: in iavf_down, disable queues when removing the driver
        vsock/virtio: initialize the_virtio_vsock before using VQs
        net: ipv6: fix typo in comments
        net: ipv4: fix typo in comments
        net/sched: act_ct: additional checks for outdated flows
        netfilter: flowtable: GC pushes back packets to classic path
        i40e: Fix wrong check for I40E_TXR_FLAGS_WB_ON_ITR
        gtp: fix fragmentation needed check with gso
        gtp: uapi: fix GTPA_MAX
        Fix NULL pointer dereference in cn_filter()
        sfc: cleanup and reduce netlink error messages
        net/handshake: fix file ref count in handshake_nl_accept_doit()
        wifi: mac80211: don't drop all unprotected public action frames
        wifi: cfg80211: fix assoc response warning on failed links
        wifi: cfg80211: pass correct pointer to rdev_inform_bss()
        isdn: mISDN: hfcsusb: Spelling fix in comment
        tcp: fix wrong RTO timeout when received SACK reneging
        r8152: Block future register access if register access fails
        r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE
        r8152: Check for unplug in r8153b_ups_en() / r8153c_ups_en()
        ...
      c17cda15
    • iavf: in iavf_down, disable queues when removing the driver · 53798666
      Michal Schmidt authored
      In iavf_down, we're skipping the scheduling of certain operations if
      the driver is being removed. However, the IAVF_FLAG_AQ_DISABLE_QUEUES
      request must not be skipped in this case, because iavf_close waits
      for the transition to the __IAVF_DOWN state, which happens in
      iavf_virtchnl_completion after the queues are released.
      
      Without this fix, "rmmod iavf" takes half a second per interface that's
      up and prints the "Device resources not yet released" warning.
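
      The rule described above can be sketched as a small standalone model.
      This is not the driver's code: the struct, the REQ_* bit values and the
      function names are hypothetical, chosen only to show that the optional
      requests are skipped during removal while the queue-disable request is
      always scheduled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request bits for this model; not the driver's real values. */
#define REQ_DEL_MAC_FILTER  (1u << 0)
#define REQ_DEL_VLAN_FILTER (1u << 1)
#define REQ_DISABLE_QUEUES  (1u << 2)

struct adapter_model {
    bool in_remove;        /* stands in for __IAVF_IN_REMOVE_TASK */
    uint32_t aq_required;  /* pending admin-queue requests */
};

/*
 * Model of the down path: optional cleanup requests are skipped while the
 * driver is being removed, but the queue-disable request is always
 * scheduled, because the close path waits for the state change that
 * follows it.
 */
static void model_down(struct adapter_model *a)
{
    if (!a->in_remove)
        a->aq_required |= REQ_DEL_MAC_FILTER | REQ_DEL_VLAN_FILTER;

    a->aq_required |= REQ_DISABLE_QUEUES;
}

/* Small self-check covering both cases. */
static void model_down_example(void)
{
    struct adapter_model removing = { .in_remove = true, .aq_required = 0 };
    struct adapter_model running = { .in_remove = false, .aq_required = 0 };

    model_down(&removing);
    model_down(&running);

    assert(removing.aq_required == REQ_DISABLE_QUEUES);
    assert(running.aq_required & REQ_DISABLE_QUEUES);
}
```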
      
      Fixes: c8de44b5 ("iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set")
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Tested-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Link: https://lore.kernel.org/r/20231025183213.874283-1-jacob.e.keller@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      53798666
  2. 25 Oct, 2023 8 commits
  3. 24 Oct, 2023 8 commits
  4. 23 Oct, 2023 9 commits
    • sfc: cleanup and reduce netlink error messages · d788c933
      Pieter Jansen van Vuuren authored
      Reduce the length of netlink error messages as they are likely to be
      truncated anyway. Additionally, reword netlink error messages so they
      are more consistent with previous messages.
      
      Fixes: 9dbc8d2b ("sfc: add decrement ipv6 hop limit by offloading set hop limit actions")
      Fixes: 3c9561c0 ("sfc: support TC decap rules matching on enc_ip_tos")
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202310202136.4u7bv0hp-lkp@intel.com/
      Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
      Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
      Link: https://lore.kernel.org/r/20231020140149.30490-1-pieter.jansen-van-vuuren@amd.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d788c933
    • Merge tag 'for-6.6-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · e017769f
      Linus Torvalds authored
      Pull btrfs fix from David Sterba:
       "One more fix for a problem with snapshot of a newly created subvolume
        that can lead to inconsistent data under some circumstances. Kernel
        6.5 added a performance optimization to skip transaction commit for
        subvolume creation but this could end up with newer data on disk but
        not linked to other structures.
      
        The fix itself is an added condition, the rest of the patch is a
        parameter added to several functions"
      
      * tag 'for-6.6-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fix unwritten extent buffer after snapshotting a new subvolume
      e017769f
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 7c145640
      Linus Torvalds authored
      Pull virtio fixes from Michael Tsirkin:
       "A collection of small fixes that look like worth having in this
        release"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        virtio_pci: fix the common cfg map size
        virtio-crypto: handle config changed by work queue
        vhost: Allow null msg.size on VHOST_IOTLB_INVALIDATE
        vdpa/mlx5: Fix firmware error on creation of 1k VQs
        virtio_balloon: Fix endless deflation and inflation on arm64
        vdpa/mlx5: Fix double release of debugfs entry
        virtio-mmio: fix memory leak of vm_dev
        vdpa_sim_blk: Fix the potential leak of mgmt_dev
        tools/virtio: Add dma sync api for virtio test
      7c145640
    • net/handshake: fix file ref count in handshake_nl_accept_doit() · 7798b594
      Moritz Wanzenböck authored
      If req->hr_proto->hp_accept() fails, we call fput() twice:
      Once in the error path, but also a second time because sock->file
      is at that point already associated with the file descriptor. Once
      the task exits, as it would probably do after receiving an error
      reading from netlink, the fd is closed, calling fput() a second time.
      
      To fix, we move installing the file after the error path for the
      hp_accept() call. In the case of errors we simply put the unused fd.
      In case of success we can use fd_install() to link the sock->file
      to the reserved fd.
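
      The reference imbalance and the reordering can be illustrated with a
      simplified userspace model. Everything here is hypothetical (the
      *_model types and helpers are not kernel APIs), and it assumes one
      reference is taken for the fd and dropped once on the error path; it
      only shows why installing the file before the accept step leads to a
      second drop at task exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct file_model { int refs; };

struct task_model {
    struct file_model *installed; /* file owned by the fd table, if any */
};

static void get_file_model(struct file_model *f) { f->refs++; }
static void fput_model(struct file_model *f) { f->refs--; }

/* Task exit closes every installed fd, dropping the fd table's reference. */
static void task_exit_model(struct task_model *t)
{
    if (t->installed) {
        fput_model(t->installed);
        t->installed = NULL;
    }
}

/* Old ordering: the file is attached to the fd before the accept step. */
static int accept_old(struct task_model *t, struct file_model *f, bool accept_ok)
{
    get_file_model(f);
    t->installed = f;       /* the fd already points at the file */
    if (!accept_ok) {
        fput_model(f);      /* the error path drops the reference ... */
        return -1;          /* ... and task exit will drop it again */
    }
    return 0;
}

/* New ordering: only a reserved, empty fd exists until accept succeeds. */
static int accept_new(struct task_model *t, struct file_model *f, bool accept_ok)
{
    get_file_model(f);
    if (!accept_ok) {
        fput_model(f);      /* single drop; the reserved fd is released unused */
        return -1;
    }
    t->installed = f;       /* plays the role of fd_install() */
    return 0;
}

/* Failure case under both orderings, starting from the socket's own reference. */
static void accept_example(void)
{
    struct file_model old_file = { .refs = 1 }, new_file = { .refs = 1 };
    struct task_model old_task = { NULL }, new_task = { NULL };

    accept_old(&old_task, &old_file, false);
    task_exit_model(&old_task);
    accept_new(&new_task, &new_file, false);
    task_exit_model(&new_task);

    assert(old_file.refs == 0); /* the socket's own reference was lost */
    assert(new_file.refs == 1); /* gets and puts are balanced */
}
```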
      
      Fixes: 7ea9c1ec ("net/handshake: Fix handshake_dup() ref counting")
      Signed-off-by: Moritz Wanzenböck <moritz.wanzenboeck@linbit.com>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Link: https://lore.kernel.org/r/20231019125847.276443-1-moritz.wanzenboeck@linbit.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7798b594
    • btrfs: fix unwritten extent buffer after snapshotting a new subvolume · eb96e221
      Filipe Manana authored
      When creating a snapshot of a subvolume that was created in the current
      transaction, we can end up not persisting a dirty extent buffer that is
      referenced by the snapshot, resulting in IO errors due to checksum failures
      when trying to read the extent buffer later from disk. A sequence of steps
      that leads to this is the following:
      
      1) At ioctl.c:create_subvol() we allocate an extent buffer, with logical
         address 36007936, for the leaf/root of a new subvolume that has an ID
         of 291. We mark the extent buffer as dirty, and at this point the
         subvolume tree has a single node/leaf which is also its root (level 0);
      
      2) We no longer commit the transaction used to create the subvolume at
         create_subvol(). We used to, but that was recently removed in
         commit 1b53e51a ("btrfs: don't commit transaction for every subvol
         create");
      
      3) The transaction used to create the subvolume has an ID of 33, so the
         extent buffer 36007936 has a generation of 33;
      
      4) Several updates happen to subvolume 291 during transaction 33, several
         files are created and its tree height changes from 0 to 1, so we end up with
         a new root at level 1 and the extent buffer 36007936 is now a leaf of
         that new root node, which is extent buffer 36048896.
      
         The commit root remains as 36007936, since we are still at transaction
         33;
      
      5) Creation of a snapshot of subvolume 291, with an ID of 292, starts at
         ioctl.c:create_snapshot(). This triggers a commit of transaction 33 and
         we end up at transaction.c:create_pending_snapshot(), in the critical
         section of a transaction commit.
      
         There we COW the root of subvolume 291, which is extent buffer 36048896.
         The COW operation returns extent buffer 36048896, since there's no need
         to COW because the extent buffer was created in this transaction and it
         was not written yet.
      
         Then we call btrfs_copy_root() against the root node 36048896. During
         this operation we allocate a new extent buffer to turn into the root
         node of the snapshot, copy the contents of the root node 36048896 into
         this snapshot root extent buffer, set the owner to 292 (the ID of the
         snapshot), etc, and then we call btrfs_inc_ref(). This will create a
         delayed reference for each leaf pointed by the root node with a
         reference root of 292 - this includes a reference for the leaf
         36007936.
      
         After that we set the bit BTRFS_ROOT_FORCE_COW in the root's state.
      
         Then we call btrfs_insert_dir_item(), to create the directory entry in
         the tree of subvolume 291 that points to the snapshot. This ends up
         needing to modify leaf 36007936 to insert the respective directory
         items. Because the bit BTRFS_ROOT_FORCE_COW is set for the root's state,
         we need to COW the leaf. We end up at btrfs_force_cow_block() and then
         at update_ref_for_cow().
      
         At update_ref_for_cow() we call btrfs_block_can_be_shared() which
         returns false, despite the fact the leaf 36007936 is shared - the
         subvolume's root and the snapshot's root point to that leaf. The
         reason that it incorrectly returns false is because the commit root
         of the subvolume is extent buffer 36007936 - it was the initial root
         of the subvolume when we created it. So btrfs_block_can_be_shared()
         which has the following logic:
      
         int btrfs_block_can_be_shared(struct btrfs_root *root,
                                       struct extent_buffer *buf)
         {
             if (test_bit(BTRFS_ROOT_SHAREABLE, &root->state) &&
                 buf != root->node && buf != root->commit_root &&
                 (btrfs_header_generation(buf) <=
                  btrfs_root_last_snapshot(&root->root_item) ||
                  btrfs_header_flag(buf, BTRFS_HEADER_FLAG_RELOC)))
                     return 1;
      
             return 0;
         }
      
         Returns false (0) since 'buf' (extent buffer 36007936) matches the
         root's commit root.
      
         As a result, at update_ref_for_cow(), we don't check for the number
         of references for extent buffer 36007936, we just assume it's not
         shared and therefore that it has only 1 reference, so we set the local
         variable 'refs' to 1.
      
         Later on, in the final if-else statement at update_ref_for_cow():
      
         static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans,
                                                struct btrfs_root *root,
                                                struct extent_buffer *buf,
                                                struct extent_buffer *cow,
                                                int *last_ref)
         {
            (...)
            if (refs > 1) {
                (...)
            } else {
                (...)
                btrfs_clear_buffer_dirty(trans, buf);
                *last_ref = 1;
            }
         }
      
         So we mark the extent buffer 36007936 as not dirty, and as a result
         we don't write it to disk later in the transaction commit, despite the
         fact that the snapshot's root points to it.
      
         Attempting to access the leaf or dumping the tree for example shows
         that the extent buffer was not written:
      
         $ btrfs inspect-internal dump-tree -t 292 /dev/sdb
         btrfs-progs v6.2.2
         file tree key (292 ROOT_ITEM 33)
         node 36110336 level 1 items 2 free space 119 generation 33 owner 292
         node 36110336 flags 0x1(WRITTEN) backref revision 1
         checksum stored a8103e3e
         checksum calced a8103e3e
         fs uuid 90c9a46f-ae9f-4626-9aff-0cbf3e2e3a79
         chunk uuid e8c9c885-78f4-4d31-85fe-89e5f5fd4a07
                 key (256 INODE_ITEM 0) block 36007936 gen 33
                 key (257 EXTENT_DATA 0) block 36052992 gen 33
         checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29
         checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29
         total bytes 107374182400
         bytes used 38572032
         uuid 90c9a46f-ae9f-4626-9aff-0cbf3e2e3a79
      
         The respective on disk region is full of zeroes as the device was
         trimmed at mkfs time.
      
         Obviously 'btrfs check' also detects and complains about this:
      
         $ btrfs check /dev/sdb
         Opening filesystem to check...
         Checking filesystem on /dev/sdb
         UUID: 90c9a46f-ae9f-4626-9aff-0cbf3e2e3a79
         generation: 33 (33)
         [1/7] checking root items
         [2/7] checking extents
         checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29
         checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29
         checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29
         bad tree block 36007936, bytenr mismatch, want=36007936, have=0
         owner ref check failed [36007936 4096]
         ERROR: errors found in extent allocation tree or chunk allocation
         [3/7] checking free space tree
         [4/7] checking fs roots
         checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29
         checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29
         checksum verify failed on 36007936 wanted 0x00000000 found 0x86005f29
         bad tree block 36007936, bytenr mismatch, want=36007936, have=0
         The following tree block(s) is corrupted in tree 292:
              tree block bytenr: 36110336, level: 1, node key: (256, 1, 0)
         root 292 root dir 256 not found
         ERROR: errors found in fs roots
         found 38572032 bytes used, error(s) found
         total csum bytes: 16048
         total tree bytes: 1265664
         total fs tree bytes: 1118208
         total extent tree bytes: 65536
         btree space waste bytes: 562598
         file data blocks allocated: 65978368
          referenced 36569088
      
      Fix this by updating btrfs_block_can_be_shared() to consider that an
      extent buffer may be shared if it matches the commit root and if its
      generation matches the current transaction's generation.
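
      A standalone sketch of the old check next to one possible shape of the
      fixed check, written from the description above rather than copied
      from the patch. The *_model types and both function names are
      hypothetical, and last_snapshot = 33 at the time of the check is an
      assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct eb_model {
    uint64_t bytenr;
    uint64_t generation;
    bool reloc;                        /* stands in for BTRFS_HEADER_FLAG_RELOC */
};

struct root_model {
    bool shareable;                    /* stands in for BTRFS_ROOT_SHAREABLE */
    const struct eb_model *node;       /* current root node */
    const struct eb_model *commit_root;
    uint64_t last_snapshot;
};

/* Same conditions as the function quoted in the commit message. */
static bool can_be_shared_old(const struct root_model *root,
                              const struct eb_model *buf)
{
    return root->shareable &&
           buf != root->node && buf != root->commit_root &&
           (buf->generation <= root->last_snapshot || buf->reloc);
}

/*
 * Sketch of the described fix: a buffer that matches the commit root may
 * still be shared if it was created in the current transaction.
 */
static bool can_be_shared_new(uint64_t transid,
                              const struct root_model *root,
                              const struct eb_model *buf)
{
    if (!root->shareable || buf == root->node)
        return false;
    if (buf->generation > root->last_snapshot && !buf->reloc)
        return false;
    if (buf != root->commit_root)
        return true;
    return buf->generation == transid;
}

/* The scenario from the commit message: leaf 36007936 in transaction 33. */
static void can_be_shared_example(void)
{
    const struct eb_model leaf = { 36007936, 33, false };
    const struct eb_model new_root = { 36048896, 33, false };
    /* Assumption: last_snapshot was already updated to 33 by the snapshot. */
    const struct root_model subvol = { true, &new_root, &leaf, 33 };

    assert(!can_be_shared_old(&subvol, &leaf)); /* wrongly "not shared" */
    assert(can_be_shared_new(33, &subvol, &leaf));
}
```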
      
      This can be reproduced with the following script:
      
         $ cat test.sh
         #!/bin/bash
      
         MNT=/mnt/sdi
         DEV=/dev/sdi
      
         # Use a filesystem with a 64K node size so that we have the same node
         # size on every machine regardless of its page size (on x86_64 default
         # node size is 16K due to the 4K page size, while on PPC it's 64K by
         # default). This way we can make sure we are able to create a btree for
         # the subvolume with a height of 2.
         mkfs.btrfs -f -n 64K $DEV
         mount $DEV $MNT
      
         btrfs subvolume create $MNT/subvol
      
         # Create a few empty files on the subvolume, this bumps its btree
         # height to 2 (root node at level 1 and 2 leaves).
         for ((i = 1; i <= 300; i++)); do
             echo -n > $MNT/subvol/file_$i
         done
      
         btrfs subvolume snapshot -r $MNT/subvol $MNT/subvol/snap
      
         umount $DEV
      
         btrfs check $DEV
      
      Running it on a 6.5 kernel (or any 6.6-rc kernel at the moment):
      
         $ ./test.sh
         Create subvolume '/mnt/sdi/subvol'
         Create a readonly snapshot of '/mnt/sdi/subvol' in '/mnt/sdi/subvol/snap'
         Opening filesystem to check...
         Checking filesystem on /dev/sdi
         UUID: bbdde2ff-7d02-45ca-8a73-3c36f23755a1
         [1/7] checking root items
         [2/7] checking extents
         parent transid verify failed on 30539776 wanted 7 found 5
         parent transid verify failed on 30539776 wanted 7 found 5
         parent transid verify failed on 30539776 wanted 7 found 5
         Ignoring transid failure
         owner ref check failed [30539776 65536]
         ERROR: errors found in extent allocation tree or chunk allocation
         [3/7] checking free space tree
         [4/7] checking fs roots
         parent transid verify failed on 30539776 wanted 7 found 5
         Ignoring transid failure
         Wrong key of child node/leaf, wanted: (256, 1, 0), have: (2, 132, 0)
         Wrong generation of child node/leaf, wanted: 5, have: 7
         root 257 root dir 256 not found
         ERROR: errors found in fs roots
         found 917504 bytes used, error(s) found
         total csum bytes: 0
         total tree bytes: 851968
         total fs tree bytes: 393216
         total extent tree bytes: 65536
         btree space waste bytes: 736550
         file data blocks allocated: 0
          referenced 0
      
      A test case for fstests will follow soon.
      
      Fixes: 1b53e51a ("btrfs: don't commit transaction for every subvol create")
      CC: stable@vger.kernel.org # 6.5+
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      eb96e221
    • wifi: mac80211: don't drop all unprotected public action frames · 91535613
      Avraham Stern authored
      Not all public action frames have a protected variant. When MFP is
      enabled drop only public action frames that have a dual protected
      variant.
      
      Fixes: 76a3059c ("wifi: mac80211: drop some unprotected action frames")
      Signed-off-by: Avraham Stern <avraham.stern@intel.com>
      Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
      Link: https://lore.kernel.org/r/20231016145213.2973e3c8d3bb.I6198b8d3b04cf4a97b06660d346caec3032f232a@changeid
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      91535613
    • wifi: cfg80211: fix assoc response warning on failed links · c434b2be
      Johannes Berg authored
      The warning here shouldn't be done before we even set the
      bss field (or should've used the input data). Move the
      assignment before the warning to fix it.
      
      We noticed this now because of Wen's bugfix, where the bug
      fixed there had previously hidden this other bug.
      
      Fixes: 53ad07e9 ("wifi: cfg80211: support reporting failed links")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      c434b2be
    • wifi: cfg80211: pass correct pointer to rdev_inform_bss() · 3e3929ef
      Ben Greear authored
      Confusing struct member names here resulted in passing
      the wrong pointer, causing crashes. Pass the correct one.
      
      Fixes: eb142608 ("wifi: cfg80211: use a struct for inform_single_bss data")
      Signed-off-by: Ben Greear <greearb@candelatech.com>
      Link: https://lore.kernel.org/r/20231021154827.1142734-1-greearb@candelatech.com
      [rewrite commit message, add fixes]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      3e3929ef
    • isdn: mISDN: hfcsusb: Spelling fix in comment · 13454e6e
      Kunwu Chan authored
      protocoll -> protocol
      Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      13454e6e
  5. 22 Oct, 2023 13 commits
    • Linux 6.6-rc7 · 05d3ef8b
      Linus Torvalds authored
      05d3ef8b
    • Merge tag 'phy-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · fe3cfe86
      Linus Torvalds authored
      Pull phy fixes from Vinod Koul:
      
       - mapphone-mdm6600 runtime pm & pinctrl handling fixes
      
       - Qualcomm qmp usb pcs register fixes, qmp pcie register size warning
         fix, m31 fixes for wrong pointer in PTR_ERR and dropping wrong vreg
         check, qmp combo fix for 8550 power config register
      
       - realtek usb fix for debugfs_create_dir() and kconfig dependency
      
      * tag 'phy-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
        phy: realtek: Realtek PHYs should depend on ARCH_REALTEK
        phy: qualcomm: Fix typos in comments
        phy: qcom-qmp-combo: initialize PCS_USB registers
        phy: qcom-qmp-combo: Square out 8550 POWER_STATE_CONFIG1
        phy: qcom: m31: Remove unwanted qphy->vreg is NULL check
        phy: realtek: usb: Drop unnecessary error check for debugfs_create_dir()
        phy: qcom: phy-qcom-m31: change m31_ipq5332_regs to static
        phy: qcom: phy-qcom-m31: fix wrong pointer pass to PTR_ERR()
        dt-bindings: phy: qcom,ipq8074-qmp-pcie: fix warning regarding reg size
        phy: qcom-qmp-usb: split PCS_USB init table for sc8280xp and sa8775p
        phy: qcom-qmp-usb: initialize PCS_USB registers
        phy: mapphone-mdm6600: Fix pinctrl_pm handling for sleep pins
        phy: mapphone-mdm6600: Fix runtime PM for remove
        phy: mapphone-mdm6600: Fix runtime disable on probe
      fe3cfe86
    • Merge tag 'efi-fixes-for-v6.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · 70e65afc
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
       "The boot_params pointer fix uses a somewhat ugly extern struct
        declaration but this will be cleaned up the next cycle.
      
         - don't try to print warnings to the console when it is no longer
           available
      
         - fix theoretical memory leak in SSDT override handling
      
         - make sure that the boot_params global variable is set before the
           KASLR code attempts to hash it for 'randomness'
      
         - avoid soft lockups in the memory acceptance code"
      
      * tag 'efi-fixes-for-v6.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi/unaccepted: Fix soft lockups caused by parallel memory acceptance
        x86/boot: efistub: Assign global boot_params variable
        efi: fix memory leak in krealloc failure handling
        x86/efistub: Don't try to print after ExitBootService()
      70e65afc
    • tcp: fix wrong RTO timeout when received SACK reneging · d2a0fc37
      Fred Chen authored
      This commit fixes the wrong RTO timeout when SACK reneging is received.
      
      When an ACK arrives that indicates SACK reneging, tcp_check_sack_reneging()
      rearms the RTO timer for max(1/2*srtt, 10ms) into the future.
      
      But since commit 62d9f1a6 ("tcp: fix TLP timer not set when
      CA_STATE changes from DISORDER to OPEN") was merged, tcp_set_xmit_timer()
      is called after tcp_fastretrans_alert() (which does the SACK reneging check),
      so the RTO timeout is overwritten by tcp_set_xmit_timer() with
      icsk_rto instead of 1/2*srtt.
      
      Here is a packetdrill script to check this bug:
      0     socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0    bind(3, ..., ...) = 0
      +0    listen(3, 1) = 0
      
      // simulate srtt to 100ms
      +0    < S 0:0(0) win 32792 <mss 1000, sackOK,nop,nop,nop,wscale 7>
      +0    > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      +.1    < . 1:1(0) ack 1 win 1024
      
      +0    accept(3, ..., ...) = 4
      
      +0    write(4, ..., 10000) = 10000
      +0    > P. 1:10001(10000) ack 1
      
      // inject sack
      +.1    < . 1:1(0) ack 1 win 257 <sack 1001:10001,nop,nop>
      +0    > . 1:1001(1000) ack 1
      
      // inject sack reneging
      +.1    < . 1:1(0) ack 1001 win 257 <sack 9001:10001,nop,nop>
      
      // we expect rto fired in 1/2*srtt (50ms)
      +.05    > . 1001:2001(1000) ack 1
      
      This fix removes FLAG_SET_XMIT_TIMER from ack_flag when
      tcp_check_sack_reneging() sets the RTO timer to 1/2*srtt, so that
      the timer is not overwritten later.
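
      The ordering problem can be sketched with a toy model of the ACK
      path. The names and the flag value are hypothetical, times are in
      milliseconds, and the model assumes the reneging delay is half of srtt
      with a 10 ms floor; it only shows how clearing the flag keeps the short
      timer from being replaced by the full RTO.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_FLAG_SET_XMIT_TIMER 0x1 /* stands in for FLAG_SET_XMIT_TIMER */

struct tcp_model {
    unsigned int srtt_ms;
    unsigned int rto_ms;   /* plays the role of icsk_rto */
    unsigned int timer_ms; /* currently armed retransmit timer */
};

/* Arms the short timer on reneging; with the fix it also clears the flag. */
static void check_sack_reneging_model(struct tcp_model *tp, bool reneging,
                                      int *ack_flag, bool with_fix)
{
    unsigned int delay;

    if (!reneging)
        return;

    delay = tp->srtt_ms / 2;
    if (delay < 10)
        delay = 10;
    tp->timer_ms = delay;

    if (with_fix)
        *ack_flag &= ~MODEL_FLAG_SET_XMIT_TIMER;
}

/* The reneging check runs first, then the generic timer update. */
static void ack_model(struct tcp_model *tp, bool reneging, bool with_fix)
{
    int flag = MODEL_FLAG_SET_XMIT_TIMER;

    check_sack_reneging_model(tp, reneging, &flag, with_fix);
    if (flag & MODEL_FLAG_SET_XMIT_TIMER)
        tp->timer_ms = tp->rto_ms; /* overwrites the short timer */
}

/* srtt of 100 ms as in the packetdrill script; the 300 ms RTO is made up. */
static void ack_model_example(void)
{
    struct tcp_model before = { 100, 300, 0 };
    struct tcp_model after = { 100, 300, 0 };

    ack_model(&before, true, false);
    ack_model(&after, true, true);

    assert(before.timer_ms == 300);
    assert(after.timer_ms == 50);
}
```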
      
      Fixes: 62d9f1a6 ("tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN")
      Signed-off-by: Fred Chen <fred.chenchen03@gmail.com>
      Reviewed-by: Neal Cardwell <ncardwell@google.com>
      Tested-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d2a0fc37
    • ACPI: NFIT: Install Notify() handler before getting NFIT table · 9b311b73
      Xiang Chen authored
      If there is no NFIT at startup, acpi_nfit_add() returns 0 immediately
      and does not install the Notify() handler. If an nvdimm device is
      hotplugged later, it will not be identified because there is no Notify()
      handler.
      
      Install the handler before getting the NFIT table in acpi_nfit_add()
      to avoid the above issue.
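
      The effect of the reordering can be shown with a minimal model of the
      add path. The nfit_model struct and both functions are hypothetical
      stand-ins, not the ACPI driver's code; the model only captures the
      order of the two steps.

```c
#include <assert.h>
#include <stdbool.h>

struct nfit_model {
    bool notify_installed; /* handler for later hotplug notifications */
    bool table_parsed;
};

/* Old order: an early return for a missing table skips the handler. */
static int nfit_add_old(struct nfit_model *d, bool table_present)
{
    if (!table_present)
        return 0;
    d->table_parsed = true;
    d->notify_installed = true;
    return 0;
}

/* New order: the handler is installed before the table is looked up. */
static int nfit_add_new(struct nfit_model *d, bool table_present)
{
    d->notify_installed = true;
    if (!table_present)
        return 0;
    d->table_parsed = true;
    return 0;
}

/* Boot without an NFIT: only the new order can see a later hotplug. */
static void nfit_add_example(void)
{
    struct nfit_model old_dev = { false, false };
    struct nfit_model new_dev = { false, false };

    nfit_add_old(&old_dev, false);
    nfit_add_new(&new_dev, false);

    assert(!old_dev.notify_installed);
    assert(new_dev.notify_installed);
}
```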
      
      Fixes: dcca12ab ("ACPI: NFIT: Install Notify() handler directly")
      Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
      [ rjw: Subject and changelog edits ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      9b311b73
    • Merge branch 'r8152-reg-garbage' · a40614fe
      David S. Miller authored
      Douglas Anderson says:
      
      ====================
      r8152: Avoid writing garbage to the adapter's registers
      
      This series is the result of a cooperative debug effort between
      Realtek and the ChromeOS team. On ChromeOS, we've noticed that Realtek
      Ethernet adapters can sometimes get so wedged that even a reboot of
      the host can't get them to enumerate again, assuming that the adapter
      was on a powered hub and didn't lose power when the host rebooted. This
      is sometimes seen in the ChromeOS automated testing lab. The only way
      to recover adapters in this state is to manually power cycle them.
      
      I managed to reproduce one instance of this wedging (unknown if this
      is truly related to what the test lab sees) by doing this:
      1. Start a flood ping from a host to the device.
      2. Drop the device into kdb.
      3. Wait 90 seconds.
      4. Resume from kdb (the "g" command).
      5. Wait another 45 seconds.
      
      Upon analysis, Realtek realized this was happening:
      
      1. The Linux driver was getting a "Tx timeout" after resuming from kdb
         and then trying to reset itself.
      2. As part of the reset, the Linux driver was attempting to do a
         read-modify-write of the adapter's registers.
      3. The read would fail (due to a timeout) and the driver pretended
         that the register contained all 0xFFs. See commit f53a7ad1
         ("r8152: Set memory to all 0xFFs on failed reg reads")
      4. The driver would take this value of all 0xFFs, modify it, and
         attempt to write it back to the adapter.
      5. By this time the USB channel seemed to recover and thus we'd
         successfully write a value that was mostly 0xFFs to the adapter.
      6. The adapter didn't like this and would wedge itself.
      
      Another engineer also managed to reproduce wedging of the Realtek
      Ethernet adapter during a reboot test on an AMD Chromebook. In that
      case he was sometimes seeing -EPIPE returned from the control
      transfers.
      
      This patch series fixes both issues.
      
      Changes in v5:
      - ("Run the unload routine if we have errors during probe") new for v5.
      - ("Cancel hw_phy_work if we have an error in probe") new for v5.
      - ("Release firmware if we have an error in probe") new for v5.
      - Removed extra mutex_unlock() left over in v4.
      - Fixed minor typos.
      - Don't queue an unbind/bind reset if probe fails; just retry probe.
      
      Changes in v4:
      - Took out some unnecessary locks/unlocks of the control mutex.
      - Added comment about reading version causing probe fail if 3 fails.
      - Added text to commit msg about the potential unbind/bind loop.
      
      Changes in v3:
      - Fixed v2 changelog ending up in the commit message.
      - farmework -> framework in comments.
      
      Changes in v2:
      - ("Check for unplug in rtl_phy_patch_request()") new for v2.
      - ("Check for unplug in r8153b_ups_en() / r8153c_ups_en()") new for v2.
      - ("Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE") new for v2.
      - Reset patch no longer based on retry patch, since that was dropped.
      - Reset patch should be robust even if failures happen in probe.
      - Switched booleans to bits in the "flags" variable.
      - Check for -ENODEV instead of "udev->state == USB_STATE_NOTATTACHED"
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a40614fe
    • r8152: Block future register access if register access fails · d9962b0d
      Douglas Anderson authored
      Even though the functions to read/write registers can fail, most of
      the places in the r8152 driver that read/write register values don't
      check error codes. The lack of error code checking is problematic in
      at least two ways.
      
      The first problem is that the r8152 driver often uses code patterns
      similar to this:
        x = read_register();
        x = x | SOME_BIT;
        write_register(x);
      
      ...with the above pattern, if the read_register() fails and returns
      garbage then we'll end up trying to write modified garbage back to the
      Realtek adapter. If the write_register() succeeds that's bad. Note
      that as of commit f53a7ad1 ("r8152: Set memory to all 0xFFs on
      failed reg reads") the "garbage" returned by read_register() will at
      least be consistent garbage, but it is still garbage.
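
The difference between the unchecked and checked pattern can be sketched in plain C. This is a minimal user-space illustration, not the driver's code: `struct fake_adapter`, `set_bit_unchecked()` and `set_bit_checked()` are hypothetical names, and the failing read is simulated with a flag. The only detail taken from the text is that a failed read leaves all-0xFF data behind.

```c
#include <errno.h>
#include <stdint.h>

#define SOME_BIT 0x0001u

/* Hypothetical stand-in for the adapter: one register plus a fault switch. */
struct fake_adapter {
	uint16_t reg;        /* current register contents */
	int      fail_reads; /* non-zero: reads fail, as on a flaky USB link */
};

/* On failure, *val is filled with 0xFFFF ("consistent garbage"). */
static int read_register(struct fake_adapter *a, uint16_t *val)
{
	if (a->fail_reads) {
		*val = 0xFFFF;
		return -EPIPE;
	}
	*val = a->reg;
	return 0;
}

/* Writes always succeed in this sketch, which is the dangerous case. */
static int write_register(struct fake_adapter *a, uint16_t val)
{
	a->reg = val;
	return 0;
}

/* Unchecked read-modify-write: a failed read still reaches the hardware. */
static void set_bit_unchecked(struct fake_adapter *a)
{
	uint16_t x;

	read_register(a, &x);            /* error code ignored */
	write_register(a, x | SOME_BIT); /* may write modified garbage */
}

/* Checked variant: stop before the write if the read failed. */
static int set_bit_checked(struct fake_adapter *a)
{
	uint16_t x;
	int ret = read_register(a, &x);

	if (ret < 0)
		return ret;
	return write_register(a, x | SOME_BIT);
}
```

With `fail_reads` set, the unchecked helper overwrites the register with 0xFFFF, while the checked helper returns the error and leaves the register untouched.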
      
      It turns out that this problem is very serious. Writing garbage to
      some of the hardware registers on the Ethernet adapter can put the
      adapter in such a bad state that it needs to be power cycled (fully
      unplugged and plugged in again) before it can enumerate again.
      
      The second problem is that the r8152 driver generally has functions
      that are long sequences of register writes. Assuming everything will
      be OK if a random register write fails in the middle isn't a great
      assumption.
      
      One might wonder if the above two problems are real. You could ask if
      we would really have a successful write after a failed read. It turns
      out that the answer appears to be "yes, this can happen". In fact,
      we've seen at least two distinct failure modes where this happens.
      
      On a sc7180-trogdor Chromebook if you drop into kdb for a while and
      then resume, you can see:
      1. We get a "Tx timeout"
      2. The "Tx timeout" queues up a USB reset.
      3. In rtl8152_pre_reset() we try to reinit the hardware.
      4. The first several (2-9) register accesses fail with a timeout, then
         things recover.
      
      The above test case was actually fixed by the patch ("r8152: Increase
      USB control msg timeout to 5000ms as per spec") but at least shows
      that we really can see successful calls after failed ones.
      
      On a different (AMD) based Chromebook with a particular adapter, we
      found that during reboot tests we'd also sometimes get a transitory
      failure. In this case we saw -EPIPE being returned sometimes. Retrying
      worked, but retrying is not always safe for all register accesses
      since reading/writing some registers might have side effects (like
      registers that clear on read).
      
      Let's fully lock out all register access if a register access fails.
      When we do this, we'll try to queue up a USB reset and try to unlock
      register access after the reset. This is slightly trickier than it
      sounds since the r8152 driver has an optimized reset sequence that
      only works reliably after probe happens. In order to handle this, we
      avoid the optimized reset if probe didn't finish. Instead, we simply
      retry the probe routine in this case.
      
      When locking out access, we'll use the existing infrastructure that
      the driver was using when it detected we were unplugged. This keeps us
      from getting stuck in delay loops in some parts of the driver.
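
The lock-out idea described above might look roughly like the following user-space sketch. Everything here is an assumption made for illustration: `struct fake_dev`, `guarded_write()` and `finish_reset()` are hypothetical names, a plain `bool` stands in for the driver's flag bit, and the USB reset is modelled as a function the caller invokes by hand.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device state for the sketch. */
struct fake_dev {
	bool inaccessible; /* plays the role of an "inaccessible" flag */
	bool reset_queued; /* a reset has been requested */
	int  fail_next;    /* number of upcoming raw accesses that fail */
	int  hw_writes;    /* writes that actually reached the hardware */
};

/* Raw access that can fail transiently. */
static int raw_write(struct fake_dev *d, uint16_t val)
{
	(void)val;
	if (d->fail_next > 0) {
		d->fail_next--;
		return -ETIMEDOUT;
	}
	d->hw_writes++;
	return 0;
}

/* Every access goes through the gate; the first failure closes it. */
static int guarded_write(struct fake_dev *d, uint16_t val)
{
	int ret;

	if (d->inaccessible)
		return -ENODEV; /* locked out: do not touch the hardware */

	ret = raw_write(d, val);
	if (ret < 0) {
		d->inaccessible = true; /* block all further accesses */
		d->reset_queued = true; /* ask for a reset */
	}
	return ret;
}

/* Called once the (simulated) reset has completed. */
static void finish_reset(struct fake_dev *d)
{
	d->reset_queued = false;
	d->inaccessible = false; /* unlock register access again */
}
```

After one failed write, later calls return `-ENODEV` without reaching the hardware, even though the underlying link has already recovered, until `finish_reset()` reopens the gate.
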
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Grant Grundler <grundler@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE · 715f67f3
      Douglas Anderson authored
      When the RTL8152_UNPLUG flag is set, it tells the driver that all
      accesses will fail and that we should bail immediately. A future patch
      will use this same concept at a time when the adapter hasn't actually
      been unplugged but is about to be reset. Rename the flag in
      preparation for the future patch.
      
      This is a no-op change and just a search and replace.
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Grant Grundler <grundler@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: Check for unplug in r8153b_ups_en() / r8153c_ups_en() · bc65cc42
      Douglas Anderson authored
      If the adapter is unplugged while we're looping in r8153b_ups_en() /
      r8153c_ups_en() we could end up looping for 10 seconds (20 ms * 500
      loops). Add code similar to what's done in other places in the driver
      to check for unplug and bail.
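
A bounded polling loop with an early bail-out could be sketched as below. This is a stand-alone illustration under stated assumptions, not the driver's loop: `struct fake_poll_dev`, `status_ready()` and `wait_ready()` are hypothetical, and the 20 ms sleep per iteration is only indicated by a comment so the sketch runs instantly.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state for the sketch. */
struct fake_poll_dev {
	bool inaccessible; /* set when the adapter is gone */
	int  ready_after;  /* failed polls before ready; negative = never */
	int  polls;        /* number of status polls performed */
};

/* Simulated status read: becomes true after ready_after failed polls. */
static bool status_ready(struct fake_poll_dev *d)
{
	d->polls++;
	return d->ready_after >= 0 && d->polls > d->ready_after;
}

/* Poll up to max_loops times, but give up at once if the device is gone. */
static int wait_ready(struct fake_poll_dev *d, int max_loops)
{
	for (int i = 0; i < max_loops; i++) {
		if (d->inaccessible)
			return -ENODEV; /* bail instead of waiting out the loop */
		if (status_ready(d))
			return 0;
		/* a real driver would sleep here, e.g. for 20 ms */
	}
	return -ETIMEDOUT;
}
```

Without the flag check, a missing device costs the full `max_loops` iterations (500 * 20 ms = 10 s in the case described above); with it, the loop returns before the first poll.
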
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Grant Grundler <grundler@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: Check for unplug in rtl_phy_patch_request() · dc90ba37
      Douglas Anderson authored
      If the adapter is unplugged while we're looping in
      rtl_phy_patch_request() we could end up looping for 10 seconds (2 ms *
      5000 loops). Add code similar to what's done in other places in the
      driver to check for unplug and bail.
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Grant Grundler <grundler@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: Release firmware if we have an error in probe · b8d35024
      Douglas Anderson authored
      The error handling in rtl8152_probe() is missing a call to release
      firmware. Add it in to match what's in the cleanup code in
      rtl8152_disconnect().
      
      Fixes: 9370f2d0 ("r8152: support request_firmware for RTL8153")
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Grant Grundler <grundler@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: Cancel hw_phy_work if we have an error in probe · bb8adff9
      Douglas Anderson authored
      The error handling in rtl8152_probe() is missing a call to cancel the
      hw_phy_work. Add it in to match what's in the cleanup code in
      rtl8152_disconnect().
      
      Fixes: a028a9e0 ("r8152: move the settings of PHY to a work queue")
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Grant Grundler <grundler@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: Run the unload routine if we have errors during probe · 5dd17689
      Douglas Anderson authored
      The rtl8152_probe() function lacks a call to the chip-specific
      unload() routine when it sees an error in probe. Add it in to match
      the cleanup code in rtl8152_disconnect().
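
The general shape of such a fix, an error path in probe that undoes what a disconnect routine would undo, can be sketched in user-space C. The names below (`struct fake_probe_state`, `fake_probe()`, the three cleanup helpers) are hypothetical, and the cleanup order is an assumption for illustration; it is not taken from the driver.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical resources a probe routine might acquire. */
struct fake_probe_state {
	bool fw_loaded;     /* firmware has been requested */
	bool work_pending;  /* deferred work has been scheduled */
	bool chip_inited;   /* chip-specific init has run */
	bool fail_register; /* inject a failure at the final step */
};

static void release_fw(struct fake_probe_state *s)  { s->fw_loaded = false; }
static void cancel_work(struct fake_probe_state *s) { s->work_pending = false; }
static void chip_unload(struct fake_probe_state *s) { s->chip_inited = false; }

static int fake_probe(struct fake_probe_state *s)
{
	int ret;

	s->fw_loaded = true;
	s->chip_inited = true;
	s->work_pending = true;

	/* Final step, e.g. registering the network device, may fail. */
	ret = s->fail_register ? -EIO : 0;
	if (ret < 0)
		goto out;

	return 0;

out:
	/* Mirror the disconnect-time cleanup so nothing is left behind. */
	cancel_work(s);
	chip_unload(s);
	release_fw(s);
	return ret;
}
```

If any of the three cleanup calls were missing from the `out:` path, a failed probe would leave that resource live, which is the kind of gap these probe fixes close.
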
      
      Fixes: ac718b69 ("net/usb: new driver for RTL8152")
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Grant Grundler <grundler@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>