1. 13 Dec, 2021 14 commits
    • lockd: move lockd_start_svc() call into lockd_create_svc() · b73a2972
      NeilBrown authored
      lockd_start_svc() only needs to be called once, just after the svc is
      created.  If the start fails, the svc is discarded too.
      
      It thus makes sense to call lockd_start_svc() from lockd_create_svc().
      This allows us to remove the test against nlmsvc_rqst at the start of
      lockd_start_svc() - it must always be NULL.
      
      lockd_up() only held an extra reference on the svc until a thread was
      created - then it dropped it.  The thread - and thus the extra reference
      - will remain until kthread_stop() is called.
      Now that the thread is created in lockd_create_svc(), the extra
      reference can be dropped there.  So the 'serv' variable is no longer
      needed in lockd_up().
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • lockd: simplify management of network status notifiers · 5a8a7ff5
      NeilBrown authored
      Now that the network status notifiers use nlmsvc_serv rather than
      nlmsvc_rqst, the management can be simplified.
      
      Notifier unregistration synchronises with any pending notifications so
      providing we unregister before nlm_serv is freed no further interlock
      is required.
      
      So we move the unregister call to just before the thread is killed
      (which destroys the service) and just before the service is destroyed in
      the failure-path of lockd_up().
      
      Then nlm_ntf_refcnt and nlm_ntf_wq can be removed.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • lockd: introduce nlmsvc_serv · 2840fe86
      NeilBrown authored
      lockd has two globals - nlmsvc_task and nlmsvc_rqst - but mostly it
      wants the 'struct svc_serv', and when it doesn't want it exactly it can
      get to what it wants from the serv.
      
      This patch is a first step to removing nlmsvc_task and nlmsvc_rqst.  It
      introduces nlmsvc_serv to store the 'struct svc_serv*'.  This is set as
      soon as the serv is created, and cleared only when it is destroyed.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • NFSD: simplify locking for network notifier. · d057cfec
      NeilBrown authored
      nfsd currently maintains an open-coded read/write semaphore (refcount
      and wait queue) for each network namespace to ensure the nfs service
      isn't shut down while the notifier is running.
      
      This is excessive.  As there is unlikely to be contention between
      notifiers and they run without sleeping, a single spinlock is sufficient
      to avoid problems.
      Signed-off-by: NeilBrown <neilb@suse.de>
      [ cel: ensure nfsd_notifier_lock is static ]
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
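      The locking idea described above can be sketched in userspace C. This
      is only an illustration, not the nfsd code: struct serv, active_serv,
      and every function name below are hypothetical, and a C11 atomic_flag
      spin loop stands in for a kernel spinlock. The notifier checks the
      service pointer under the lock, and shutdown clears the pointer under
      the same lock, so the notifier never touches a service that is gone.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the service; it only counts delivered events. */
struct serv {
    int events;
};

/* A C11 atomic_flag spin loop stands in for a kernel spinlock (assumption). */
static atomic_flag notifier_lock = ATOMIC_FLAG_INIT;
static struct serv *active_serv;    /* protected by notifier_lock */

static void notifier_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&notifier_lock,
                                             memory_order_acquire))
        ;   /* spin: critical sections are short and never sleep */
}

static void notifier_lock_release(void)
{
    atomic_flag_clear_explicit(&notifier_lock, memory_order_release);
}

/* Make the service visible to the notifier. */
static void serv_publish(struct serv *s)
{
    notifier_lock_acquire();
    active_serv = s;
    notifier_lock_release();
}

/* Hide the service; after this returns no notifier can still be using it. */
static void serv_unpublish(void)
{
    notifier_lock_acquire();
    active_serv = NULL;
    notifier_lock_release();
}

/* Notifier callback: returns 1 if the event reached a live service. */
static int notifier_event(void)
{
    int delivered = 0;

    notifier_lock_acquire();
    if (active_serv) {
        active_serv->events++;
        delivered = 1;
    }
    notifier_lock_release();
    return delivered;
}
```

      Because the critical sections are short and never sleep, one lock is
      enough in this sketch; no per-namespace refcount or wait queue is
      modelled.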
    • SUNRPC: discard svo_setup and rename svc_set_num_threads_sync() · 3ebdbe52
      NeilBrown authored
      The ->svo_setup callback serves no purpose.  It is always called from
      within the same module that chooses which callback is needed.  So
      discard it and call the relevant function directly.
      
      Now that svc_set_num_threads() is no longer used remove it and rename
      svc_set_num_threads_sync() to remove the "_sync" suffix.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • NFSD: Make it possible to use svc_set_num_threads_sync · 3409e4f1
      NeilBrown authored
      nfsd cannot currently use svc_set_num_threads_sync.  It instead
      uses svc_set_num_threads which does *not* wait for threads to all
      exit, and has a separate mechanism (nfsd_shutdown_complete) to wait
      for completion.
      
      The reason that nfsd is unlike other services is that nfsd threads can
      exit separately from svc_set_num_threads being called - they die on
      receipt of SIGKILL.  Also, when the last thread exits, the service must
      be shut down (sockets closed).
      
      For this, the nfsd_mutex needs to be taken, and as that mutex needs to
      be held while svc_set_num_threads is called, the one cannot wait for
      the other.
      
      This patch changes the nfsd thread so that it can drop the ref on the
      service without blocking on nfsd_mutex, so that svc_set_num_threads_sync
      can be used:
       - if it can drop a non-last reference, it does that.  This does not
         trigger shutdown and does not require a mutex.  This will likely
         happen for all but the last thread signalled, and for all threads
         being shut down by nfsd_shutdown_threads()
       - if it can get the mutex without blocking (trylock), it does that
         and then drops the reference.  This will likely happen for the
         last thread killed by SIGKILL
       - Otherwise there might be an unrelated task holding the mutex,
         possibly in another network namespace, or nfsd_shutdown_threads()
         might be just about to get a reference on the service, after which
         we can drop ours safely.
         We cannot conveniently get wakeup notifications on these events,
         and we are unlikely to need to, so we sleep briefly and check again.
      
      With this we can discard nfsd_shutdown_complete and
      nfsd_complete_shutdown(), and switch to svc_set_num_threads_sync.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
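      The three cases listed above (drop a non-last reference, try the lock,
      otherwise sleep and retry) can be sketched in userspace C. This is a
      minimal illustration under stated assumptions, not the nfsd
      implementation: struct serv, put_unless_last(), and thread_put() are
      hypothetical names, a C11 atomic_flag used as a try-lock stands in for
      nfsd_mutex, and the 1 ms sleep is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the service object. */
struct serv {
    atomic_int ref;     /* reference count */
    bool shut_down;     /* set when the last reference is dropped */
};

/* An atomic_flag try-lock stands in for the service mutex (assumption). */
static atomic_flag serv_lock = ATOMIC_FLAG_INIT;

static bool serv_trylock(void)
{
    return !atomic_flag_test_and_set(&serv_lock);
}

static void serv_unlock(void)
{
    atomic_flag_clear(&serv_lock);
}

/* Case 1: drop a reference only if it is not the last one. */
static bool put_unless_last(struct serv *s)
{
    int old = atomic_load(&s->ref);

    while (old > 1) {
        /* On failure, 'old' is reloaded and the condition is rechecked. */
        if (atomic_compare_exchange_weak(&s->ref, &old, old - 1))
            return true;
    }
    return false;
}

/* Final put: must be called with the lock held. */
static void put_locked(struct serv *s)
{
    if (atomic_fetch_sub(&s->ref, 1) == 1)
        s->shut_down = true;    /* last reference: shut the service down */
}

/* Drop this thread's reference without ever blocking on the lock. */
static void thread_put(struct serv *s)
{
    for (;;) {
        if (put_unless_last(s))
            return;                     /* case 1: non-last reference */
        if (serv_trylock()) {           /* case 2: lock was free */
            put_locked(s);
            serv_unlock();
            return;
        }
        /* Case 3: lock is busy; sleep briefly and check again. */
        struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };
        nanosleep(&ts, NULL);
    }
}
```

      The retry loop trades a short polling delay for never waiting on the
      lock, which is the property the commit message relies on to avoid the
      two waiters blocking each other.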
    • NFSD: narrow nfsd_mutex protection in nfsd thread · 9d3792ae
      NeilBrown authored
      There is nothing happening in the start of nfsd() that requires
      protection by the mutex, so don't take it until shutting down the thread
      - which does still require protection - but only for nfsd_put().
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • SUNRPC: use sv_lock to protect updates to sv_nrthreads. · 2a36395f
      NeilBrown authored
      Using sv_lock means we don't need to hold the service mutex over these
      updates.
      
      In particular, svc_exit_thread() no longer requires synchronisation, so
      threads can exit asynchronously.
      
      Note that we could use an atomic_t, but as there are many more read
      sites than writes, that would add unnecessary noise to the code.
      Some reads are already racy, and there is no need for them to not be.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • nfsd: make nfsd_stats.th_cnt atomic_t · 9b6c8c9b
      NeilBrown authored
      This allows us to move the updates for th_cnt out of the mutex.
      This is a step towards reducing mutex coverage in nfsd().
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • SUNRPC: stop using ->sv_nrthreads as a refcount · ec52361d
      NeilBrown authored
      The use of sv_nrthreads as a general refcount results in clumsy code, as
      is seen by various comments needed to explain the situation.
      
      This patch introduces a 'struct kref' and uses that for reference
      counting, leaving sv_nrthreads to be a pure count of threads.  The kref
      is managed particularly in svc_get() and svc_put(), and also nfsd_put().
      
      svc_destroy() now takes a pointer to the embedded kref, rather than to
      the serv.
      
      nfsd allows the svc_serv to exist with ->sv_nrthreads being zero.  This
      happens when a transport is created before the first thread is started.
      To support this, a 'keep_active' flag is introduced which holds a ref on
      the svc_serv.  This is set when any listening socket is successfully
      added (unless there are running threads), and cleared when the number of
      threads is set.  So when the last thread exits, the nfs_serv will be
      destroyed.
      The use of 'keep_active' replaces previous code which checked if there
      were any permanent sockets.
      
      We no longer clear ->rq_server when nfsd() exits.  This was done
      to prevent svc_exit_thread() from calling svc_destroy().
      Instead we take an extra reference to the svc_serv to prevent
      svc_destroy() from being called.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
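      The separation described above, a lifetime reference count next to a
      pure thread count, can be sketched in userspace C. This is an
      illustration only: the struct and function names are hypothetical, a
      C11 atomic counter stands in for 'struct kref', and the 'keep_active'
      flag is not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the service object. */
struct serv {
    atomic_uint ref;        /* lifetime: plays the role of the kref */
    unsigned int nrthreads; /* pure count of running threads */
    bool destroyed;         /* set by the release step, for illustration */
};

static void serv_init(struct serv *s)
{
    atomic_init(&s->ref, 1);    /* the creator holds the first reference */
    s->nrthreads = 0;
    s->destroyed = false;
}

static struct serv *serv_get(struct serv *s)
{
    atomic_fetch_add(&s->ref, 1);
    return s;
}

/* Drop one reference; the release step runs only on the final put. */
static void serv_put(struct serv *s)
{
    if (atomic_fetch_sub(&s->ref, 1) == 1)
        s->destroyed = true;
}

/*
 * Each thread holds its own reference and is counted separately.
 * The thread count is not atomic here; the sketch assumes the caller
 * serialises updates to it.
 */
static void thread_start(struct serv *s)
{
    serv_get(s);
    s->nrthreads++;
}

static void thread_exit(struct serv *s)
{
    s->nrthreads--;
    serv_put(s);
}
```

      With the two counters split, the object can outlive a thread count of
      zero for as long as someone holds a reference, and the count of threads
      never needs explanatory comments about doubling as a refcount.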
    • SUNRPC/NFSD: clean up get/put functions. · 8c62d127
      NeilBrown authored
      svc_destroy() is poorly named - it doesn't necessarily destroy the svc,
      it might just reduce the ref count.
      nfsd_destroy() is poorly named for the same reason.
      
      This patch:
       - removes the refcount functionality from svc_destroy(), moving it to
         a new svc_put().  Almost all previous callers of svc_destroy() now
         call svc_put().
       - renames nfsd_destroy() to nfsd_put() and improves the code, using
         the new svc_destroy() rather than svc_put()
       - removes a few comments that explain the importance of balanced
         get/put calls.  This should be obvious.
      
      The only non-trivial part of this is that svc_destroy() would call
      svc_sock_update() on a non-final decrement.  It can no longer do that,
      and svc_put() isn't really a good place for it.  This call is now made
      from svc_exit_thread() which seems like a good place.  This makes the
      call *before* sv_nrthreads is decremented rather than after.  This
      is not particularly important as the call just sets a flag which
      causes sv_nrthreads to be checked later.  A subsequent patch will
      improve the ordering.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • SUNRPC: change svc_get() to return the svc. · df5e49c8
      NeilBrown authored
      It is common for 'get' functions to return the object that was 'got',
      and there are a couple of places where users of svc_get() would be a
      little simpler if svc_get() did that.
      
      Make it so.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
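      A 'get' that returns its argument lets the caller take a reference and
      store the pointer in one statement. The sketch below shows the pattern
      in userspace C; struct serv, struct holder, and the function names are
      hypothetical, and the counter is a plain integer rather than an atomic
      one.

```c
#include <stddef.h>

/* Hypothetical stand-in for the service; only the counter matters here. */
struct serv {
    unsigned int refcount;  /* plain counter: the sketch is single-threaded */
};

/* A "get" that returns the object it took a reference on. */
static struct serv *serv_get(struct serv *s)
{
    s->refcount++;
    return s;
}

/* Hypothetical user that keeps a counted pointer to the service. */
struct holder {
    struct serv *server;
};

/* Take the reference and store the pointer in a single statement. */
static void holder_attach(struct holder *h, struct serv *s)
{
    h->server = serv_get(s);
}
```

      Without the return value, holder_attach() would need two statements:
      one to take the reference and one to assign the pointer.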
    • NFSD: handle errors better in write_ports_addfd() · 89b24336
      NeilBrown authored
      If write_ports_add() fails, we shouldn't destroy the serv, unless we had
      only just created it.  So if there are any permanent sockets already
      attached, leave the serv in place.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    • NFSD: Fix sparse warning · c2f1c4bd
      Chuck Lever authored
      /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24: warning: incorrect type in assignment (different base types)
      /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24:    expected restricted __be32 [usertype] status
      /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24:    got int
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
  2. 12 Dec, 2021 14 commits
  3. 11 Dec, 2021 12 commits
    • Merge tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux · bbdff6d5
      Linus Torvalds authored
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Prevent out-of-bounds access to per sample registers.
      
       - Fix NULL vs IS_ERR_OR_NULL() checking on the python binding.
      
       - Intel PT fixes, half of those are one-liners:
            - Fix some PGE (packet generation enable/control flow packets) usage.
            - Fix sync state when a PSB (synchronization) packet is found.
            - Fix intel_pt_fup_event() assumptions about setting state type.
            - Fix state setting when receiving overflow (OVF) packet.
            - Fix next 'err' value, walking trace.
            - Fix missing 'instruction' events with 'q' option.
            - Fix error timestamp setting on the decoder error path.
      
      * tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf python: Fix NULL vs IS_ERR_OR_NULL() checking
        perf intel-pt: Fix error timestamp setting on the decoder error path
        perf intel-pt: Fix missing 'instruction' events with 'q' option
        perf intel-pt: Fix next 'err' value, walking trace
        perf intel-pt: Fix state setting when receiving overflow (OVF) packet
        perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type
        perf intel-pt: Fix sync state when a PSB (synchronization) packet is found
        perf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage
        perf tools: Prevent out-of-bounds access to registers
    • Merge tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block · eccea80b
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A few block fixes that should go into this release:
      
         - NVMe pull request:
              - set ana_log_size to 0 after freeing ana_log_buf (Hou Tao)
              - show subsys nqn for duplicate cntlids (Keith Busch)
              - disable namespace access for unsupported metadata (Keith
                Busch)
              - report write pointer for a full zone as zone start + zone len
                (Niklas Cassel)
              - fix use after free when disconnecting a reconnecting ctrl
                (Ruozhu Li)
              - fix a list corruption in nvmet-tcp (Sagi Grimberg)
      
         - Fix for a regression on DIO single bio async IO (Pavel)
      
         - ioprio seteuid fix (Davidlohr)
      
         - mtd fix that subsequently got reverted as it was broken, will get
           re-done and submitted for the next round
      
         - Two MD fixes via Song (Markus, zhangyue)"
      
      * tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block:
        Revert "mtd_blkdevs: don't scan partitions for plain mtdblock"
        block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2)
        md: fix double free of mddev->private in autorun_array()
        md: fix update super 1.0 on rdev size change
        nvmet-tcp: fix possible list corruption for unexpected command failure
        block: fix single bio async DIO error handling
        nvme: fix use after free when disconnecting a reconnecting ctrl
        nvme-multipath: set ana_log_size to 0 after free ana_log_buf
        mtd_blkdevs: don't scan partitions for plain mtdblock
        nvme: report write pointer for a full zone as zone start + zone len
        nvme: disable namespace access for unsupported metadata
        nvme: show subsys nqn for duplicate cntlids
    • Merge tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block · f152165a
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "A few fixes that are all bound for stable:
      
         - Two syzbot reports for io-wq that turned out to be separate fixes,
           but ultimately very closely related
      
         - io_uring task_work running on cancelations"
      
      * tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block:
        io-wq: check for wq exit after adding new worker task_work
        io_uring: ensure task_work gets run as part of cancelations
        io-wq: remove spurious bit clear on task_work addition
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · bd66be54
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Two more I2C driver bugfixes"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: mpc: Use atomic read and fix break condition
        i2c: virtio: fix completion handling
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 2acdaf59
      Linus Torvalds authored
      Pull clk driver fixes from Stephen Boyd:
      
       - Fix qcom mux logic to look at the proper parent table member. Luckily
         this clk type isn't very common.
      
       - Don't kill clks on qcom systems that use Trion PLLs that are enabled
         out of the bootloader. We will simply skip programming the PLL rate
         if it's already done.
      
       - Use the proper clk_ops for the qcom sm6125 ICE clks.
      
       - Use module_platform_driver() in i.MX as it can be a module.
      
       - Fix a UAF in the versatile clk driver on an error path.
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: versatile: clk-icst: use after free on error path
        clk: qcom: sm6125-gcc: Swap ops of ice and apps on sdcc1
        clk: imx: use module_platform_driver
        clk: qcom: clk-alpha-pll: Don't reconfigure running Trion
        clk: qcom: regmap-mux: fix parent clock lookup
    • Merge tag 'devicetree-fixes-for-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · a84e0b31
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Revert schema checks on %.dtb targets. This was problematic for some
         external build tools.
      
       - A few DT binding example fixes
      
       - Add back dropped 'enet-phy-lane-no-swap' Ethernet PHY property
      
       - Drop erroneous if/then schema in nxp,imx7-mipi-csi2
      
       - Add a quirk to fix some interrupt controllers' use of 'interrupt-map'
      
      * tag 'devicetree-fixes-for-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        Revert "kbuild: Enable DT schema checks for %.dtb targets"
        dt-bindings: bq25980: Fixup the example
        dt-bindings: input: gpio-keys: Fix interrupts in example
        dt-bindings: net: Reintroduce PHY no lane swap binding
        dt-bindings: media: nxp,imx7-mipi-csi2: Drop bad if/then schema
        of/irq: Add a quirk for controllers with their own definition of interrupt-map
        dt-bindings: iio: adc: exynos-adc: Fix node name in example
    • Merge branch 'akpm' (patches from Andrew) · df442a4e
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "21 patches.
      
        Subsystems affected by this patch series: MAINTAINERS, mailmap, and mm
        (mlock, pagecache, damon, slub, memcg, hugetlb, and pagecache)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
        mm: bdi: initialize bdi_min_ratio when bdi is unregistered
        hugetlbfs: fix issue of preallocation of gigantic pages can't work
        mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()
        mm/slub: fix endianness bug for alloc/free_traces attributes
        selftests/damon: split test cases
        selftests/damon: test debugfs file reads/writes with huge count
        selftests/damon: test wrong DAMOS condition ranges input
        selftests/damon: test DAMON enabling with empty target_ids case
        selftests/damon: skip test if DAMON is running
        mm/damon/vaddr-test: remove unnecessary variables
        mm/damon/vaddr-test: split a test function having >1024 bytes frame size
        mm/damon/vaddr: remove an unnecessary warning message
        mm/damon/core: remove unnecessary error messages
        mm/damon/dbgfs: remove an unnecessary error message
        mm/damon/core: use better timer mechanisms selection threshold
        mm/damon/core: fix fake load reports due to uninterruptible sleeps
        timers: implement usleep_idle_range()
        filemap: remove PageHWPoison check from next_uptodate_page()
        mailmap: update email address for Guo Ren
        MAINTAINERS: update kdump maintainers
        ...
    • Merge tag 'timers-v5.16-rc4' of https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent · aa073d8b
      Thomas Gleixner authored
      
      Pull timer fixes from Daniel Lezcano:
      
        - Fix build error with clang and some kernel configuration on the
          arm64 architected timer by inlining the
          erratum_set_next_event_generic() function (Marc Zyngier)
      
        - Fix probe error on the dw_apb_timer_of driver by fixing the
          incorrect condition previously introduced (Alexey Sheplyakov)
      
      Link: https://lore.kernel.org/r/429b796d-9395-4ca8-81f3-30911f80a9a9@linaro.org
    • perf python: Fix NULL vs IS_ERR_OR_NULL() checking · 9937e8da
      Miaoqian Lin authored
      The function trace_event__tp_format_id may return ERR_PTR(-ENOMEM).  Use
      IS_ERR_OR_NULL to check tp_format.
      Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Link: http://lore.kernel.org/lkml/20211211053856.19827-1-linmq006@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
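      A function that can return NULL, an error encoded in the pointer, or a
      valid pointer needs a check that covers both failure forms. The sketch
      below re-creates the kernel's error-pointer helpers in userspace C for
      illustration; lookup_format() and format_usable() are hypothetical and
      are not the perf code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace re-creation of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline bool IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
    return ptr == NULL || IS_ERR(ptr);
}

static int dummy_format;

/* Hypothetical lookup with three outcomes: error pointer, NULL, valid. */
static void *lookup_format(int id)
{
    if (id < 0)
        return ERR_PTR(-ENOMEM);
    if (id == 0)
        return NULL;
    return &dummy_format;
}

/* A plain NULL test would accept ERR_PTR(-ENOMEM) as a valid pointer. */
static bool format_usable(int id)
{
    return !IS_ERR_OR_NULL(lookup_format(id));
}
```

      In this sketch, lookup_format(-1) is non-NULL, so a check written as
      "if (!ptr)" would let the error pointer through to a later dereference.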
    • perf intel-pt: Fix error timestamp setting on the decoder error path · 6665b8e4
      Adrian Hunter authored
      An error timestamp shows the last known timestamp for the queue, but this
      is not updated on the error path. Fix by setting it.
      
      Fixes: f4aa0819 ("perf tools: Add Intel PT decoder")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v5.15+
      Link: https://lore.kernel.org/r/20211210162303.2288710-8-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Fix missing 'instruction' events with 'q' option · a882cc94
      Adrian Hunter authored
      FUP packets contain IP information, which makes them also an 'instruction'
      event in 'hop' mode i.e. the itrace 'q' option.  That wasn't happening, so
      restructure the logic so that FUP events are added along with appropriate
      'instruction' and 'branch' events.
      
      Fixes: 7c1b16ba ("perf intel-pt: Add support for decoding FUP/TIP only")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v5.15+
      Link: https://lore.kernel.org/r/20211210162303.2288710-7-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Fix next 'err' value, walking trace · a32e6c5d
      Adrian Hunter authored
      Code after label 'next:' in intel_pt_walk_trace() assumes 'err' is zero,
      but it may not be, if arrived at via a 'goto'. Ensure it is zero.
      
      Fixes: 7c1b16ba ("perf intel-pt: Add support for decoding FUP/TIP only")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: stable@vger.kernel.org # v5.15+
      Link: https://lore.kernel.org/r/20211210162303.2288710-6-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
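      The hazard described above, code after a label assuming a status
      variable is zero when the label can also be reached by 'goto', can be
      sketched in a small C loop. This is not the decoder code: the packet
      types, try_decode(), and walk_trace() are hypothetical and only mirror
      the shape of the problem.

```c
#include <stddef.h>

/* Hypothetical packet types for the sketch. */
enum pkt { PKT_DATA, PKT_SKIP };

/* Hypothetical decode step: PKT_SKIP reports a recoverable error. */
static int try_decode(enum pkt p)
{
    return p == PKT_SKIP ? -1 : 0;
}

/* Walks all packets; returns 0 on a clean end of input. */
static int walk_trace(const enum pkt *pkts, size_t n, size_t *decoded)
{
    int err = 0;
    size_t i = 0;

    *decoded = 0;
    while (i < n) {
        err = try_decode(pkts[i]);
        if (err)
            goto next;      /* recoverable: move on to the next packet */
        (*decoded)++;
next:
        err = 0;            /* reached by fall-through and by goto */
        i++;
        if (i == n)
            return err;     /* assumes err is zero at this point */
    }
    return err;
}
```

      Without the reset after the label, a trace ending in a skipped packet
      would make walk_trace() return the stale -1 instead of 0.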