1. 04 Jun, 2020 14 commits
    • David Howells's avatar
      afs: Show more a bit more server state in /proc/net/afs/servers · 32275d3f
      David Howells authored
      Display more information about the state of a server record, including the
      flags, rtt and break counter plus the probe state for each server in
      /proc/net/afs/servers.
      
      Rearrange the server flags a bit to make them easier to read at a glance in
      the proc file.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      32275d3f
    • David Howells's avatar
      afs: Don't use probe running state to make decisions outside probe code · f3c130e6
      David Howells authored
      Don't use the running state for fileserver probes to make decisions about
      which server to use as the state is cleared at the start of a probe and
      also intermediate values might be misleading.
      
      Instead, add a separate 'latest known' rtt in the afs_server struct and a
      flag to indicate if the server is known to be responding and update these
      as and when we know what to change them to.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      f3c130e6
    • David Howells's avatar
      afs: Fix afs_statfs() to not let the values go below zero · f11a016a
      David Howells authored
      Fix afs_statfs() so that the value for f_bavail and f_bfree don't go
      "negative" if the number of blocks in use by a volume exceeds the max quota
      for that volume.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      f11a016a
    • David Howells's avatar
      afs: Fix the by-UUID server tree to allow servers with the same UUID · 3c4c4075
      David Howells authored
      Whilst it shouldn't happen, it is possible for multiple fileservers to
      share a UUID, particularly if an entire cell has been duplicated, UUIDs and
      all.  In such a case, it's not necessarily possible to map the effect of
      the CB.InitCallBackState3 incoming RPC to a specific server unambiguously
      by UUID and thus to a specific cell.
      
      Indeed, there's a problem whereby multiple server records may need to
      occupy the same spot in the rb_tree rooted in the afs_net struct.
      
      Fix this by allowing servers to form a list, with the head of the list in
      the tree.  When the front entry in the list is removed, the second in the
      list just replaces it.  afs_init_callback_state() then just goes down the
      line, poking each server in the list.
      
      This means that some servers will be unnecessarily poked, unfortunately.
      An alternative would be to route by call parameters.
      Reported-by: default avatarJeffrey Altman <jaltman@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      3c4c4075
    • David Howells's avatar
      afs: Reorganise volume and server trees to be rooted on the cell · 20325960
      David Howells authored
      Reorganise afs_volume objects such that they're in a tree keyed on volume
      ID, rooted at on an afs_cell object rather than being in multiple trees,
      each of which is rooted on an afs_server object.
      
      afs_server structs become per-cell and acquire a pointer to the cell.
      
      The process of breaking a callback then starts with finding the server by
      its network address, following that to the cell and then looking up each
      volume ID in the volume tree.
      
      This is simpler than the afs_vol_interest/afs_cb_interest N:M mapping web
      and allows those structs and the code for maintaining them to be simplified
      or removed.
      
      It does make a couple of things a bit more tricky, though:
      
       (1) Operations now start with a volume, not a server, so there can be more
           than one answer as to whether or not the server we'll end up using
           supports the FS.InlineBulkStatus RPC.
      
       (2) CB RPC operations that specify the server UUID.  There's still a tree
           of servers by UUID on the afs_net struct, but the UUIDs in it aren't
           guaranteed unique.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      20325960
    • David Howells's avatar
      afs: Add a tracepoint to track the lifetime of the afs_volume struct · cca37d45
      David Howells authored
      Add a tracepoint to track the lifetime of the afs_volume struct.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      cca37d45
    • David Howells's avatar
      afs: Detect cell aliases 3 - YFS Cells with a canonical cell name op · 6dfdf536
      David Howells authored
      YFS Volume Location servers have an operation by which the cell name may be
      queried.  Use this to find out what a YFS server thinks the canonical cell
      name should be.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      6dfdf536
    • David Howells's avatar
      afs: Detect cell aliases 2 - Cells with no root volumes · 6ef350b1
      David Howells authored
      Implement the second phase of cell alias detection.  This part handles
      alias detection for cells that don't have root.cell volumes and so we have
      to find some other volume or fileserver to query.
      
      We take the first volume from each such cell and attempt to look it up in
      the new cell.  If found, we compare the records, if they are the same, we
      judge the cell names to be aliases.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      6ef350b1
    • David Howells's avatar
      afs: Detect cell aliases 1 - Cells with root volumes · 8a070a96
      David Howells authored
      Put in the first phase of cell alias detection.  This part handles alias
      detection for cells that have root.cell volumes (which is expected to be
      likely).
      
      When a cell becomes newly active, it is probed for its root.cell volume,
      and if it has one, this volume is compared against other root.cell volumes
      to find out if the list of fileserver UUIDs have any in common - and if
      that's the case, do the address lists of those fileservers have any
      addresses in common.  If they do, the new cell is adjudged to be an alias
      of the old cell and the old cell is used instead.
      
      Comparing is aided by the server list in struct afs_server_list being
      sorted in UUID order and the addresses in the fileserver address lists
      being sorted in address order.
      
      The cell then retains the afs_volume object for the root.cell volume, even
      if it's not mounted for future alias checking.
      
      This necessary because:
      
       (1) Whilst fileservers have UUIDs that are meant to be globally unique, in
           practice they are not because cells get cloned without changing the
           UUIDs - so afs_server records need to be per cell.
      
       (2) Sometimes the DNS is used to make cell aliases - but if we don't know
           they're the same, we may end up with multiple superblocks and multiple
           afs_server records for the same thing, impairing our ability to
           deliver callback notifications of third party changes
      
       (3) The fileserver RPC API doesn't contain the cell name, so it can't tell
           us which cell it's notifying and can't see that a change made to to
           one cell should notify the same client that's also accessed as the
           other cell.
      Reported-by: default avatarJeffrey Altman <jaltman@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      8a070a96
    • David Howells's avatar
      afs: Implement client support for the YFSVL.GetCellName RPC op · c3e9f888
      David Howells authored
      Implement client support for the YFSVL.GetCellName RPC operation by which
      YFS permits the canonical cell name to be queried from a VL server.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      c3e9f888
    • David Howells's avatar
      afs: Retain more of the VLDB record for alias detection · 194d28cf
      David Howells authored
      Save more bits from the volume location database record obtained for a
      server so that we can use this information in cell alias detection.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      194d28cf
    • David Howells's avatar
      afs: Fix handling of CB.ProbeUuid cache manager op · 3120c170
      David Howells authored
      The AFS filesystem driver is handling the CB.ProbeUuid request incorrectly.
      The UUID presented in the request is that of the cache manager, not the
      fileserver, so afs_deliver_cb_probe_uuid() shouldn't be using that UUID to
      look up the server.
      
      Fix this by looking up the server by address instead.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      3120c170
    • David Howells's avatar
      afs: Don't get epoch from a server because it may be ambiguous · 44746355
      David Howells authored
      Don't get the epoch from a server, particularly one that we're looking up
      by UUID, as UUIDs may be ambiguous and may map to more than one server - so
      we can't draw any conclusions from it.
      Reported-by: default avatarJeffrey Altman <jaltman@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      44746355
    • David Howells's avatar
      afs: Build an abstraction around an "operation" concept · e49c7b2f
      David Howells authored
      Turn the afs_operation struct into the main way that most fileserver
      operations are managed.  Various things are added to the struct, including
      the following:
      
       (1) All the parameters and results of the relevant operations are moved
           into it, removing corresponding fields from the afs_call struct.
           afs_call gets a pointer to the op.
      
       (2) The target volume is made the main focus of the operation, rather than
           the target vnode(s), and a bunch of op->vnode->volume are made
           op->volume instead.
      
       (3) Two vnode records are defined (op->file[]) for the vnode(s) involved
           in most operations.  The vnode record (struct afs_vnode_param)
           contains:
      
      	- The vnode pointer.
      
      	- The fid of the vnode to be included in the parameters or that was
                returned in the reply (eg. FS.MakeDir).
      
      	- The status and callback information that may be returned in the
           	  reply about the vnode.
      
      	- Callback break and data version tracking for detecting
                simultaneous third-parth changes.
      
       (4) Pointers to dentries to be updated with new inodes.
      
       (5) An operations table pointer.  The table includes pointers to functions
           for issuing AFS and YFS-variant RPCs, handling the success and abort
           of an operation and handling post-I/O-lock local editing of a
           directory.
      
      To make this work, the following function restructuring is made:
      
       (A) The rotation loop that issues calls to fileservers that can be found
           in each function that wants to issue an RPC (such as afs_mkdir()) is
           extracted out into common code, in a new file called fs_operation.c.
      
       (B) The rotation loops, such as the one in afs_mkdir(), are replaced with
           a much smaller piece of code that allocates an operation, sets the
           parameters and then calls out to the common code to do the actual
           work.
      
       (C) The code for handling the success and failure of an operation are
           moved into operation functions (as (5) above) and these are called
           from the core code at appropriate times.
      
       (D) The pseudo inode getting stuff used by the dynamic root code is moved
           over into dynroot.c.
      
       (E) struct afs_iget_data is absorbed into the operation struct and
           afs_iget() expects to be given an op pointer and a vnode record.
      
       (F) Point (E) doesn't work for the root dir of a volume, but we know the
           FID in advance (it's always vnode 1, unique 1), so a separate inode
           getter, afs_root_iget(), is provided to special-case that.
      
       (G) The inode status init/update functions now also take an op and a vnode
           record.
      
       (H) The RPC marshalling functions now, for the most part, just take an
           afs_operation struct as their only argument.  All the data they need
           is held there.  The result delivery functions write their answers
           there as well.
      
       (I) The call is attached to the operation and then the operation core does
           the waiting.
      
      And then the new operation code is, for the moment, made to just initialise
      the operation, get the appropriate vnode I/O locks and do the same rotation
      loop as before.
      
      This lays the foundation for the following changes in the future:
      
       (*) Overhauling the rotation (again).
      
       (*) Support for asynchronous I/O, where the fileserver rotation must be
           done asynchronously also.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      e49c7b2f
  2. 31 May, 2020 12 commits
    • David Howells's avatar
      afs: Rename struct afs_fs_cursor to afs_operation · a310082f
      David Howells authored
      As a prelude to implementing asynchronous fileserver operations in the afs
      filesystem, rename struct afs_fs_cursor to afs_operation.
      
      This struct is going to form the core of the operation management and is
      going to acquire more members in later.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      a310082f
    • David Howells's avatar
      afs: Remove the error argument from afs_protocol_error() · 7126ead9
      David Howells authored
      Remove the error argument from afs_protocol_error() as it's always
      -EBADMSG.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      7126ead9
    • David Howells's avatar
      afs: Set error flag rather than return error from file status decode · 38355eec
      David Howells authored
      Set a flag in the call struct to indicate an unmarshalling error rather
      than return and handle an error from the decoding of file statuses.  This
      flag is checked on a successful return from the delivery function.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      38355eec
    • David Howells's avatar
      afs: Make callback processing more efficient. · 8230fd82
      David Howells authored
      afs_vol_interest objects represent the volume IDs currently being accessed
      from a fileserver.  These hold lists of afs_cb_interest objects that
      repesent the superblocks using that volume ID on that server.
      
      When a callback notification from the server telling of a modification by
      another client arrives, the volume ID specified in the notification is
      looked up in the server's afs_vol_interest list.  Through the
      afs_cb_interest list, the relevant superblocks can be iterated over and the
      specific inode looked up and marked in each one.
      
      Make the following efficiency improvements:
      
       (1) Hold rcu_read_lock() over the entire processing rather than locking it
           each time.
      
       (2) Do all the callbacks for each vid together rather than individually.
           Each volume then only needs to be looked up once.
      
       (3) afs_vol_interest objects are now stored in an rb_tree rather than a
           flat list to reduce the lookup step count.
      
       (4) afs_vol_interest lookup is now done with RCU, but because it's in an
           rb_tree which may rotate under us, a seqlock is used so that if it
           changes during the walk, we repeat the walk with a lock held.
      
      With this and the preceding patch which adds RCU-based lookups in the inode
      cache, target volumes/vnodes can be taken without the need to take any
      locks, except on the target itself.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      8230fd82
    • David Howells's avatar
      afs: Show more information in /proc/net/afs/servers · 6d043a57
      David Howells authored
      Show more information in /proc/net/afs/servers to make it easier to see
      what's going on with the server probing.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      6d043a57
    • David Howells's avatar
      afs: Actively poll fileservers to maintain NAT or firewall openings · f6cbb368
      David Howells authored
      When an AFS client accesses a file, it receives a limited-duration callback
      promise that the server will notify it if another client changes a file.
      This callback duration can be a few hours in length.
      
      If a client mounts a volume and then an application prevents it from being
      unmounted, say by chdir'ing into it, but then does nothing for some time,
      the rxrpc_peer record will expire and rxrpc-level keepalive will cease.
      
      If there is NAT or a firewall between the client and the server, the route
      back for the server may close after a comparatively short duration, meaning
      that attempts by the server to notify the client may then bounce.
      
      The client, however, may (so far as it knows) still have a valid unexpired
      promise and will then rely on its cached data and will not see changes made
      on the server by a third party until it incidentally rechecks the status or
      the promise needs renewal.
      
      To deal with this, the client needs to regularly probe the server.  This
      has two effects: firstly, it keeps a route open back for the server, and
      secondly, it causes the server to disgorge any notifications that got
      queued up because they couldn't be sent.
      
      Fix this by adding a mechanism to emit regular probes.
      
      Two levels of probing are made available: Under normal circumstances the
      'slow' queue will be used for a fileserver - this just probes the preferred
      address once every 5 mins or so; however, if server fails to respond to any
      probes, the server will shift to the 'fast' queue from which all its
      interfaces will be probed every 30s.  When it finally responds, the record
      will switch back to the slow queue.
      
      Further notes:
      
       (1) Probing is now no longer driven from the fileserver rotation
           algorithm.
      
       (2) Probes are dispatched to all interfaces on a fileserver when that an
           afs_server object is set up to record it.
      
       (3) The afs_server object is removed from the probe queues when we start
           to probe it.  afs_is_probing_server() returns true if it's not listed
           - ie. it's undergoing probing.
      
       (4) The afs_server object is added back on to the probe queue when the
           final outstanding probe completes, but the probed_at time is set when
           we're about to launch a probe so that it's not dependent on the probe
           duration.
      
       (5) The timer and the work item added for this must be handed a count on
           net->servers_outstanding, which they hand on or release.  This makes
           sure that network namespace cleanup waits for them.
      
      Fixes: d2ddc776 ("afs: Overhaul volume and server record caching and fileserver rotation")
      Reported-by: default avatarDave Botsch <botsch@cnf.cornell.edu>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      f6cbb368
    • David Howells's avatar
      afs: Split the usage count on struct afs_server · 977e5f8e
      David Howells authored
      Split the usage count on the afs_server struct to have an active count that
      registers who's actually using it separately from the reference count on
      the object.
      
      This allows a future patch to dispatch polling probes without advancing the
      "unuse" time into the future each time we emit a probe, which would
      otherwise prevent unused server records from expiring.
      
      Included in this:
      
       (1) The latter part of afs_destroy_server() in which the RCU destruction
           of afs_server objects is invoked and the outstanding server count is
           decremented is split out into __afs_put_server().
      
       (2) afs_put_server() now calls __afs_put_server() rather then setting the
           management timer.
      
       (3) The calls begun by afs_fs_give_up_all_callbacks() and
           afs_fs_get_capabilities() can now take a ref on the server record, so
           afs_destroy_server() can just drop its ref and needn't wait for the
           completion of these calls.  They'll put the ref when they're done.
      
       (4) Because of (3), afs_fs_probe_done() no longer needs to wake up
           afs_destroy_server() with server->probe_outstanding.
      
       (5) afs_gc_servers can be simplified.  It only needs to check if
           server->active is 0 rather than playing games with the refcount.
      
       (6) afs_manage_servers() can propose a server for gc if usage == 0 rather
           than if ref == 1.  The gc is effected by (5).
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      977e5f8e
    • David Howells's avatar
      afs: Use the serverUnique field in the UVLDB record to reduce rpc ops · 81006805
      David Howells authored
      The U-version VLDB volume record retrieved by the VL.GetEntryByNameU rpc op
      carries a change counter (the serverUnique field) for each fileserver
      listed in the record as backing that volume.  This is incremented whenever
      the registration details for a fileserver change (such as its address
      list).  Note that the same value will be seen in all UVLDB records that
      refer to that fileserver.
      
      This should be checked before calling the VL server to re-query the address
      list for a fileserver.  If it's the same, there's no point doing the query.
      Reported-by: default avatarJeffrey Altman <jaltman@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      81006805
    • David Howells's avatar
      afs: Always include dir in bulk status fetch from afs_do_lookup() · 13fcc635
      David Howells authored
      When a lookup is done in an AFS directory, the filesystem will speculate
      and fetch up to 49 other statuses for files in the same directory and fetch
      those as well, turning them into inodes or updating inodes that already
      exist.
      
      However, occasionally, a callback break might go missing due to NAT timing
      out, but the afs filesystem doesn't then realise that the directory is not
      up to date.
      
      Alleviate this by using one of the status slots to check the directory in
      which the lookup is being done.
      Reported-by: default avatarDave Botsch <botsch@cnf.cornell.edu>
      Suggested-by: default avatarJeffrey Altman <jaltman@auristor.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      13fcc635
    • David Howells's avatar
      rxrpc: Adjust /proc/net/rxrpc/calls to display call->debug_id not user_ID · 32f71aa4
      David Howells authored
      The user ID value isn't actually much use - and leaks a kernel pointer or a
      userspace value - so replace it with the call debug ID, which appears in trace
      points.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      32f71aa4
    • David Howells's avatar
      rxrpc: Map the EACCES error produced by some ICMP6 to EHOSTUNREACH · 23e2db31
      David Howells authored
      Map the EACCES error that is produced by some ICMP6 packets to EHOSTUNREACH
      when we get them as EACCES has other meanings within a filesystem context.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      23e2db31
    • David Howells's avatar
      vfs, afs, ext4: Make the inode hash table RCU searchable · 3f19b2ab
      David Howells authored
      Make the inode hash table RCU searchable so that searches that want to
      access or modify an inode without taking a ref on that inode can do so
      without taking the inode hash table lock.
      
      The main thing this requires is some RCU annotation on the list
      manipulation operations.  Inodes are already freed by RCU in most cases.
      
      Users of this interface must take care as the inode may be still under
      construction or may be being torn down around them.
      
      There are at least three instances where this can be of use:
      
       (1) Testing whether the inode number iunique() is going to return is
           currently unique (the iunique_lock is still held).
      
       (2) Ext4 date stamp updating.
      
       (3) AFS callback breaking.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Acked-by: default avatarKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      cc: linux-ext4@vger.kernel.org
      cc: linux-afs@lists.infradead.org
      3f19b2ab
  3. 24 May, 2020 5 commits
    • Linus Torvalds's avatar
      Linux 5.7-rc7 · 9cb1fd0e
      Linus Torvalds authored
      9cb1fd0e
    • Linus Torvalds's avatar
      Merge tag 'efi-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 98790bba
      Linus Torvalds authored
      Pull EFI fixes from Thomas Gleixner:
       "A set of EFI fixes:
      
         - Don't return a garbage screen info when EFI framebuffer is not
           available
      
         - Make the early EFI console work properly with wider fonts instead
           of drawing garbage
      
         - Prevent a memory buffer leak in allocate_e820()
      
         - Print the firmware error record properly so it can be decoded by
           users
      
         - Fix a symbol clash in the host tool build which only happens with
           newer compilers.
      
         - Add a missing check for the event log version of TPM which caused
           boot failures on several Dell systems due to an attempt to decode
           SHA-1 format with the crypto agile algorithm"
      
      * tag 'efi-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tpm: check event log version before reading final events
        efi: Pull up arch-specific prototype efi_systab_show_arch()
        x86/boot: Mark global variables as static
        efi: cper: Add support for printing Firmware Error Record Reference
        efi/libstub/x86: Avoid EFI map buffer alloc in allocate_e820()
        efi/earlycon: Fix early printk for wider fonts
        efi/libstub: Avoid returning uninitialized data from setup_graphics()
      98790bba
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 667b6249
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "Two fixes for x86:
      
         - Unbreak stack dumps for inactive tasks by interpreting the special
           first frame left by __switch_to_asm() correctly.
      
           The recent change not to skip the first frame so ORC and frame
           unwinder behave in the same way caused all entries to be
           unreliable, i.e. prepended with '?'.
      
         - Use cpumask_available() instead of an implicit NULL check of a
           cpumask_var_t in mmio trace to prevent a Clang build warning"
      
      * tag 'x86-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks
        x86/mmiotrace: Use cpumask_available() for cpumask_var_t variables
      667b6249
    • Linus Torvalds's avatar
      Merge tag 'sched-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9e61d12b
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
       "A set of fixes for the scheduler:
      
         - Fix handling of throttled parents in enqueue_task_fair() completely.
      
           The recent fix overlooked a corner case where the first iteration
           terminates due to an entity already being on the runqueue which
           makes the list management incomplete and later triggers the
           assertion which checks for completeness.
      
         - Fix a similar problem in unthrottle_cfs_rq().
      
         - Show the correct uclamp values in procfs which prints the effective
           value twice instead of requested and effective"
      
      * tag 'sched-urgent-2020-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list
        sched/debug: Fix requested task uclamp values shown in procfs
        sched/fair: Fix enqueue_task_fair() warning some more
      9e61d12b
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · caffb99b
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix RCU warnings in ipv6 multicast router code, from Madhuparna
          Bhowmik.
      
       2) Nexthop attributes aren't being checked properly because of
          mis-initialized iterator, from David Ahern.
      
       3) Revert iop_idents_reserve() change as it caused performance
          regressions and was just working around what is really a UBSAN bug
          in the compiler. From Yuqi Jin.
      
       4) Read MAC address properly from ROM in bmac driver (double iteration
          proceeds past end of address array), from Jeremy Kerr.
      
       5) Add Microsoft Surface device IDs to r8152, from Marc Payne.
      
       6) Prevent reference to freed SKB in __netif_receive_skb_core(), from
          Boris Sukholitko.
      
       7) Fix ACK discard behavior in rxrpc, from David Howells.
      
       8) Preserve flow hash across packet scrubbing in wireguard, from Jason
          A. Donenfeld.
      
       9) Cap option length properly for SO_BINDTODEVICE in AX25, from Eric
          Dumazet.
      
      10) Fix encryption error checking in kTLS code, from Vadim Fedorenko.
      
      11) Missing BPF prog ref release in flow dissector, from Jakub Sitnicki.
      
      12) dst_cache must be used with BH disabled in tipc, from Eric Dumazet.
      
      13) Fix use after free in mlxsw driver, from Jiri Pirko.
      
      14) Order kTLS key destruction properly in mlx5 driver, from Tariq
          Toukan.
      
      15) Check devm_platform_ioremap_resource() return value properly in
          several drivers, from Tiezhu Yang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
        net: smsc911x: Fix runtime PM imbalance on error
        net/mlx4_core: fix a memory leak bug.
        net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend
        net: phy: mscc: fix initialization of the MACsec protocol mode
        net: stmmac: don't attach interface until resume finishes
        net: Fix return value about devm_platform_ioremap_resource()
        net/mlx5: Fix error flow in case of function_setup failure
        net/mlx5e: CT: Correctly get flow rule
        net/mlx5e: Update netdev txq on completions during closure
        net/mlx5: Annotate mutex destroy for root ns
        net/mlx5: Don't maintain a case of del_sw_func being null
        net/mlx5: Fix cleaning unmanaged flow tables
        net/mlx5: Fix memory leak in mlx5_events_init
        net/mlx5e: Fix inner tirs handling
        net/mlx5e: kTLS, Destroy key object after destroying the TIS
        net/mlx5e: Fix allowed tc redirect merged eswitch offload cases
        net/mlx5: Avoid processing commands before cmdif is ready
        net/mlx5: Fix a race when moving command interface to events mode
        net/mlx5: Add command entry handling completion
        rxrpc: Fix a memory leak in rxkad_verify_response()
        ...
      caffb99b
  4. 23 May, 2020 9 commits
    • Dinghao Liu's avatar
      net: smsc911x: Fix runtime PM imbalance on error · 539d39ad
      Dinghao Liu authored
      Remove runtime PM usage counter decrement when the
      increment function has not been called to keep the
      counter balanced.
      Signed-off-by: default avatarDinghao Liu <dinghao.liu@zju.edu.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      539d39ad
    • David S. Miller's avatar
      Merge tag 'mlx5-fixes-2020-05-22' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · e3181e9a
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2020-05-22
      
      This series introduces some fixes to mlx5 driver.
      
      Please pull and let me know if there is any problem.
      
      For -stable v4.13
         ('net/mlx5: Add command entry handling completion')
      
      For -stable v5.2
         ('net/mlx5: Fix error flow in case of function_setup failure')
         ('net/mlx5: Fix memory leak in mlx5_events_init')
      
      For -stable v5.3
         ('net/mlx5e: Update netdev txq on completions during closure')
         ('net/mlx5e: kTLS, Destroy key object after destroying the TIS')
         ('net/mlx5e: Fix inner tirs handling')
      
      For -stable v5.6
         ('net/mlx5: Fix cleaning unmanaged flow tables')
         ('net/mlx5: Fix a race when moving command interface to events mode')
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e3181e9a
    • Qiushi Wu's avatar
      net/mlx4_core: fix a memory leak bug. · febfd9d3
      Qiushi Wu authored
      In function mlx4_opreq_action(), pointer "mailbox" is not released,
      when mlx4_cmd_box() return and error, causing a memory leak bug.
      Fix this issue by going to "out" label, mlx4_free_cmd_mailbox() can
      free this pointer.
      
      Fixes: fe6f700d ("net/mlx4_core: Respond to operation request by firmware")
      Signed-off-by: default avatarQiushi Wu <wu000273@umn.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      febfd9d3
    • Grygorii Strashko's avatar
      net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend · 4c64b83d
      Grygorii Strashko authored
      vlan_for_each() are required to be called with rtnl_lock taken, otherwise
      ASSERT_RTNL() warning will be triggered - which happens now during System
      resume from suspend:
        cpsw_suspend()
        |- cpsw_ndo_stop()
          |- __hw_addr_ref_unsync_dev()
            |- cpsw_purge_all_mc()
               |- vlan_for_each()
                  |- ASSERT_RTNL();
      
      Hence, fix it by surrounding cpsw_ndo_stop() by rtnl_lock/unlock() calls.
      
      Fixes: 15180eca ("net: ethernet: ti: cpsw: fix vlan mcast")
      Signed-off-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4c64b83d
    • Antoine Tenart's avatar
      net: phy: mscc: fix initialization of the MACsec protocol mode · 0ddfee1f
      Antoine Tenart authored
      At the very end of the MACsec block initialization in the MSCC PHY
      driver, the MACsec "protocol mode" is set. This setting should be set
      based on the PHY id within the package, as the bank used to access the
      register used depends on this. This was not done correctly, and only the
      first bank was used leading to the two upper PHYs being unstable when
      using the VSC8584. This patch fixes it.
      
      Fixes: 1bbe0ecc ("net: phy: mscc: macsec initialization")
      Signed-off-by: default avatarAntoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0ddfee1f
    • Leon Yu's avatar
      net: stmmac: don't attach interface until resume finishes · 31096c3e
      Leon Yu authored
      Commit 14b41a29 ("net: stmmac: Delete txtimer in suspend") was the
      first attempt to fix a race between mod_timer() and setup_timer()
      during stmmac_resume(). However the issue still exists as the commit
      only addressed half of the issue.
      
      Same race can still happen as stmmac_resume() re-attaches interface
      way too early - even before hardware is fully initialized.  Worse,
      doing so allows network traffic to restart and stmmac_tx_timer_arm()
      being called in the middle of stmmac_resume(), which re-init tx timers
      in stmmac_init_coalesce().  timer_list will be corrupted and system
      crashes as a result of race between mod_timer() and setup_timer().
      
        systemd--1995    2.... 552950018us : stmmac_suspend: 4994
        ksoftirq-9       0..s2 553123133us : stmmac_tx_timer_arm: 2276
        systemd--1995    0.... 553127896us : stmmac_resume: 5101
        systemd--320     7...2 553132752us : stmmac_tx_timer_arm: 2276
        (sd-exec-1999    5...2 553135204us : stmmac_tx_timer_arm: 2276
        ---------------------------------
        pc : run_timer_softirq+0x468/0x5e0
        lr : run_timer_softirq+0x570/0x5e0
        Call trace:
         run_timer_softirq+0x468/0x5e0
         __do_softirq+0x124/0x398
         irq_exit+0xd8/0xe0
         __handle_domain_irq+0x6c/0xc0
         gic_handle_irq+0x60/0xb0
         el1_irq+0xb8/0x180
         arch_cpu_idle+0x38/0x230
         default_idle_call+0x24/0x3c
         do_idle+0x1e0/0x2b8
         cpu_startup_entry+0x28/0x48
         secondary_start_kernel+0x1b4/0x208
      
      Fix this by deferring netif_device_attach() to the end of
      stmmac_resume().
      Signed-off-by: default avatarLeon Yu <leoyu@nvidia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      31096c3e
    • Tiezhu Yang's avatar
      net: Fix return value about devm_platform_ioremap_resource() · ef24d6c3
      Tiezhu Yang authored
      When call function devm_platform_ioremap_resource(), we should use IS_ERR()
      to check the return value and return PTR_ERR() if failed.
      Signed-off-by: default avatarTiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ef24d6c3
    • Mike Rapoport's avatar
      sparc32: fix page table traversal in srmmu_nocache_init() · 0cfc8a8d
      Mike Rapoport authored
      The srmmu_nocache_init() uses __nocache_fix() macro to add an offset to
      page table entry to access srmmu_nocache_pool.
      
      But since sparc32 has only three actual page table levels, pgd, p4d and
      pud are essentially the same thing and pgd_offset() and p4d_offset() are
      no-ops, the __nocache_fix() should be done only at PUD level.
      
      Remove __nocache_fix() for p4d_offset() and pud_offset() and keep it
      only for PUD and lower levels.
      
      Fixes: c2bc26f7 ("sparc32: use PUD rather than PGD to get PMD in srmmu_nocache_init()")
      Signed-off-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Anatoly Pugachev <matorola@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0cfc8a8d
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 423b8baf
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "11 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        MAINTAINERS: add files related to kdump
        z3fold: fix use-after-free when freeing handles
        sparc32: use PUD rather than PGD to get PMD in srmmu_nocache_init()
        MAINTAINERS: update email address for Naoya Horiguchi
        sh: include linux/time_types.h for sockios
        kasan: disable branch tracing for core runtime
        selftests/vm/write_to_hugetlbfs.c: fix unused variable warning
        selftests/vm/.gitignore: add mremap_dontunmap
        rapidio: fix an error in get_user_pages_fast() error handling
        x86: bitops: fix build regression
        device-dax: don't leak kernel memory to user space after unloading kmem
      423b8baf