1. 31 Jan, 2019 11 commits
    • Mathieu Malaterre's avatar
      drivers: base: Use __printf markup to silence compiler · fa548d79
      Mathieu Malaterre authored
      Silence warnings (triggered at W=1) by adding relevant __printf
      attributes.
      
        drivers/base/cpu.c:432:2: warning: function '__cpu_device_create' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
      Signed-off-by: default avatarMathieu Malaterre <malat@debian.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa548d79
    • Richard Gong's avatar
      firmware: intel_stratix10_service: add hardware dependency · 095ff29d
      Richard Gong authored
      Add a Kconfig dependency to ensure Intel Stratix10 service layer driver
      can be built only on the platform that supports it.
      Signed-off-by: default avatarRichard Gong <richard.gong@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      095ff29d
    • Alexander Duyck's avatar
      driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity · 57ea974f
      Alexander Duyck authored
      The current async_probe test code is only testing one device allocated
      prior to driver load and only loading one device afterwards. Instead of
      doing things this way it makes much more sense to load one device per CPU
      in order to actually stress the async infrastructure. By doing this we
      should see delays significantly increase in the event of devices being
      serialized.
      
      In addition I have updated the test to verify that we are trying to place
      the work on the correct NUMA node when we are running in async mode. By
      doing this we can verify the best possible outcome for device and driver
      load times.
      
      I have added a timeout value that is used to disable the sleep and instead
      cause the probe routine to report an error indicating it timed out. By
      doing this we limit the maximum runtime for the test to 20 seconds or less.
      
      The last major change in this set is that I have gone through and tuned it
      for handling the massive number of possible events that will be scheduled.
      Instead of reporting the sleep for each individual device it is moved to
      only being displayed if we enable debugging.
      
      With this patch applied below are what a failing test and a passing test
      should look like. I elided a few hundred lines in the failing test that
      were duplicated since the system I was testing on had a massive number of
      CPU cores:
      
      -- Failing --
      [  243.524697] test_async_driver_probe: registering first set of asynchronous devices...
      [  243.535625] test_async_driver_probe: registering asynchronous driver...
      [  243.543038] test_async_driver_probe: registration took 0 msecs
      [  243.549559] test_async_driver_probe: registering second set of asynchronous devices...
      [  243.568350] platform test_async_driver.447: registration took 9 msecs
      [  243.575544] test_async_driver_probe: registering first synchronous device...
      [  243.583454] test_async_driver_probe: registering synchronous driver...
      [  248.825920] test_async_driver_probe: registration took 5235 msecs
      [  248.825922] test_async_driver_probe: registering second synchronous device...
      [  248.825928] test_async_driver test_async_driver.443: NUMA node mismatch 3 != 1
      [  248.825932] test_async_driver test_async_driver.445: NUMA node mismatch 3 != 1
      [  248.825935] test_async_driver test_async_driver.446: NUMA node mismatch 3 != 1
      [  248.825939] test_async_driver test_async_driver.440: NUMA node mismatch 3 != 1
      [  248.825943] test_async_driver test_async_driver.441: NUMA node mismatch 3 != 1
      ...
      [  248.827150] test_async_driver test_async_driver.229: NUMA node mismatch 0 != 1
      [  248.827158] test_async_driver test_async_driver.228: NUMA node mismatch 0 != 1
      [  248.827220] test_async_driver test_async_driver.281: NUMA node mismatch 2 != 1
      [  248.827229] test_async_driver test_async_driver.282: NUMA node mismatch 2 != 1
      [  248.827240] test_async_driver test_async_driver.280: NUMA node mismatch 2 != 1
      [  253.945834] test_async_driver test_async_driver.1: NUMA node mismatch 0 != 1
      [  253.945878] test_sync_driver test_sync_driver.1: registration took 5119 msecs
      [  253.961693] test_async_driver_probe: async events still pending, forcing timeout and synchronize
      [  259.065839] test_async_driver test_async_driver.2: NUMA node mismatch 0 != 1
      [  259.073786] test_async_driver test_async_driver.3: async probe took too long
      [  259.081669] test_async_driver test_async_driver.3: NUMA node mismatch 0 != 1
      [  259.089569] test_async_driver test_async_driver.4: async probe took too long
      [  259.097451] test_async_driver test_async_driver.4: NUMA node mismatch 0 != 1
      [  259.105338] test_async_driver test_async_driver.5: async probe took too long
      [  259.113204] test_async_driver test_async_driver.5: NUMA node mismatch 0 != 1
      [  259.121089] test_async_driver test_async_driver.6: async probe took too long
      [  259.128961] test_async_driver test_async_driver.6: NUMA node mismatch 0 != 1
      [  259.136850] test_async_driver test_async_driver.7: async probe took too long
      ...
      [  262.124062] test_async_driver test_async_driver.221: async probe took too long
      [  262.132130] test_async_driver test_async_driver.221: NUMA node mismatch 3 != 1
      [  262.140206] test_async_driver test_async_driver.222: async probe took too long
      [  262.148277] test_async_driver test_async_driver.222: NUMA node mismatch 3 != 1
      [  262.156351] test_async_driver test_async_driver.223: async probe took too long
      [  262.164419] test_async_driver test_async_driver.223: NUMA node mismatch 3 != 1
      [  262.172630] test_async_driver_probe: Test failed with 222 errors and 336 warnings
      
      -- Passing --
      [  105.419247] test_async_driver_probe: registering first set of asynchronous devices...
      [  105.432040] test_async_driver_probe: registering asynchronous driver...
      [  105.439718] test_async_driver_probe: registration took 0 msecs
      [  105.446239] test_async_driver_probe: registering second set of asynchronous devices...
      [  105.477986] platform test_async_driver.447: registration took 22 msecs
      [  105.485276] test_async_driver_probe: registering first synchronous device...
      [  105.493169] test_async_driver_probe: registering synchronous driver...
      [  110.597981] test_async_driver_probe: registration took 5097 msecs
      [  110.604806] test_async_driver_probe: registering second synchronous device...
      [  115.707490] test_sync_driver test_sync_driver.1: registration took 5094 msecs
      [  115.715478] test_async_driver_probe: completed successfully
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      57ea974f
    • Alexander Duyck's avatar
      libnvdimm: Schedule device registration on node local to the device · af87b9a7
      Alexander Duyck authored
      Force the device registration for nvdimm devices to be closer to the actual
      device. This is achieved by using either the NUMA node ID of the region, or
      of the parent. By doing this we can have everything above the region based
      on the region, and everything below the region based on the nvdimm bus.
      
      By guaranteeing NUMA locality I see an improvement of as high as 25% for
      per-node init of a system with 12TB of persistent memory.
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      af87b9a7
    • Alexander Duyck's avatar
      PM core: Use new async_schedule_dev command · 8b9ec6b7
      Alexander Duyck authored
      Use the device specific version of the async_schedule commands to defer
      various tasks related to power management. By doing this we should see a
      slight improvement in performance as any device that is sensitive to
      latency/locality in the setup will now be initializing on the node closest
      to the device.
      Reviewed-by: default avatarDan Williams <dan.j.williams@intel.com>
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Reviewed-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8b9ec6b7
    • Alexander Duyck's avatar
      driver core: Attach devices on CPU local to device node · c37e20ea
      Alexander Duyck authored
      Call the asynchronous probe routines on a CPU local to the device node. By
      doing this we should be able to improve our initialization time
      significantly as we can avoid having to access the device from a remote
      node which may introduce higher latency.
      
      For example, in the case of initializing memory for NVDIMM this can have a
      significant impact as initialing 3TB on remote node can take up to 39
      seconds while initialing it on a local node only takes 23 seconds. It is
      situations like this where we will see the biggest improvement.
      Reviewed-by: default avatarDan Williams <dan.j.williams@intel.com>
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Reviewed-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c37e20ea
    • Alexander Duyck's avatar
      async: Add support for queueing on specific NUMA node · 6be9238e
      Alexander Duyck authored
      Introduce four new variants of the async_schedule_ functions that allow
      scheduling on a specific NUMA node.
      
      The first two functions are async_schedule_near and
      async_schedule_near_domain end up mapping to async_schedule and
      async_schedule_domain, but provide NUMA node specific functionality. They
      replace the original functions which were moved to inline function
      definitions that call the new functions while passing NUMA_NO_NODE.
      
      The second two functions are async_schedule_dev and
      async_schedule_dev_domain which provide NUMA specific functionality when
      passing a device as the data member and that device has a NUMA node other
      than NUMA_NO_NODE.
      
      The main motivation behind this is to address the need to be able to
      schedule device specific init work on specific NUMA nodes in order to
      improve performance of memory initialization.
      
      I have seen a significant improvement in initialziation time for persistent
      memory as a result of this approach. In the case of 3TB of memory on a
      single node the initialization time in the worst case went from 36s down to
      about 26s for a 10s improvement. As such the data shows a general benefit
      for affinitizing the async work to the node local to the device.
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Reviewed-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6be9238e
    • Alexander Duyck's avatar
      workqueue: Provide queue_work_node to queue work near a given NUMA node · 8204e0c1
      Alexander Duyck authored
      Provide a new function, queue_work_node, which is meant to schedule work on
      a "random" CPU of the requested NUMA node. The main motivation for this is
      to help assist asynchronous init to better improve boot times for devices
      that are local to a specific node.
      
      For now we just default to the first CPU that is in the intersection of the
      cpumask of the node and the online cpumask. The only exception is if the
      CPU is local to the node we will just use the current CPU. This should work
      for our purposes as we are currently only using this for unbound work so
      the CPU will be translated to a node anyway instead of being directly used.
      
      As we are only using the first CPU to represent the NUMA node for now I am
      limiting the scope of the function so that it can only be used with unbound
      workqueues.
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Acked-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8204e0c1
    • Alexander Duyck's avatar
      driver core: Probe devices asynchronously instead of the driver · ef0ff683
      Alexander Duyck authored
      Probe devices asynchronously instead of the driver. This results in us
      seeing the same behavior if the device is registered before the driver or
      after. This way we can avoid serializing the initialization should the
      driver not be loaded until after the devices have already been added.
      
      The motivation behind this is that if we have a set of devices that
      take a significant amount of time to load we can greatly reduce the time to
      load by processing them in parallel instead of one at a time. In addition,
      each device can exist on a different node so placing a single thread on one
      CPU to initialize all of the devices for a given driver can result in poor
      performance on a system with multiple nodes.
      
      This approach can reduce the time needed to scan SCSI LUNs significantly.
      The only way to realize that speedup is by enabling more concurrency which
      is what is achieved with this patch.
      
      To achieve this it was necessary to add a new member "async_driver" to the
      device_private structure to store the driver pointer while we wait on the
      deferred probe call.
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Reviewed-by: default avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Reviewed-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef0ff683
    • Alexander Duyck's avatar
      device core: Consolidate locking and unlocking of parent and device · ed88747c
      Alexander Duyck authored
      Try to consolidate all of the locking and unlocking of both the parent and
      device when attaching or removing a driver from a given device.
      
      To do that I first consolidated the lock pattern into two functions
      __device_driver_lock and __device_driver_unlock. After doing that I then
      created functions specific to attaching and detaching the driver while
      acquiring these locks. By doing this I was able to reduce the number of
      spots where we touch need_parent_lock from 12 down to 4.
      
      This patch should produce no functional changes, it is meant to be a code
      clean-up/consolidation only.
      Reviewed-by: default avatarLuis Chamberlain <mcgrof@kernel.org>
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Reviewed-by: default avatarDan Williams <dan.j.williams@intel.com>
      Reviewed-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed88747c
    • Alexander Duyck's avatar
      driver core: Establish order of operations for device_add and device_del via bitflag · 3451a495
      Alexander Duyck authored
      Add an additional bit flag to the device_private struct named "dead".
      
      This additional flag provides a guarantee that when a device_del is
      executed on a given interface an async worker will not attempt to attach
      the driver following the earlier device_del call. Previously this
      guarantee was not present and could result in the device_del call
      attempting to remove a driver from an interface only to have the async
      worker attempt to probe the driver later when it finally completes the
      asynchronous probe call.
      
      One additional change added was that I pulled the check for dev->driver
      out of the __device_attach_driver call and instead placed it in the
      __device_attach_async_helper call. This was motivated by the fact that the
      only other caller of this, __device_attach, had already taken the
      device_lock() and checked for dev->driver. Instead of testing for this
      twice in this path it makes more sense to just consolidate the dev->dead
      and dev->driver checks together into one set of checks.
      Reviewed-by: default avatarDan Williams <dan.j.williams@intel.com>
      Reviewed-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3451a495
  2. 22 Jan, 2019 19 commits
  3. 18 Jan, 2019 2 commits
  4. 15 Jan, 2019 1 commit
  5. 13 Jan, 2019 7 commits
    • Linus Torvalds's avatar
      Linux 5.0-rc2 · 1c7fc5cb
      Linus Torvalds authored
      1c7fc5cb
    • Jonathan Neuschäfer's avatar
      kernel/sys.c: Clarify that UNAME26 does not generate unique versions anymore · b7285b42
      Jonathan Neuschäfer authored
      UNAME26 is a mechanism to report Linux's version as 2.6.x, for
      compatibility with old/broken software.  Due to the way it is
      implemented, it would have to be updated after 5.0, to keep the
      resulting versions unique.  Linus Torvalds argued:
      
       "Do we actually need this?
      
        I'd rather let it bitrot, and just let it return random versions. It
        will just start again at 2.4.60, won't it?
      
        Anybody who uses UNAME26 for a 5.x kernel might as well think it's
        still 4.x. The user space is so old that it can't possibly care about
        differences between 4.x and 5.x, can it?
      
        The only thing that matters is that it shows "2.4.<largeenough>",
        which it will do regardless"
      Signed-off-by: default avatarJonathan Neuschäfer <j.neuschaefer@gmx.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b7285b42
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · dbc3c09b
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A bigger batch than I anticipated this week, for two reasons:
      
         - Some fallout on Davinci from board file -> DTB conversion, that
           also includes a few longer-standing fixes (i.e. not recent
           regressions).
      
         - drivers/reset material that has been in linux-next for a while, but
           didn't get sent to us until now for a variety of reasons
           (maintainer out sick, holidays, etc). There's a functional
           dependency in there such that one platform (Altera's SoCFPGA) won't
           boot without one of the patches; instead of reverting the patch
           that got merged, I looked at this set and decided it was small
           enough that I'll pick it up anyway. If you disagree I can revisit
           with a smaller set.
      
        That being said, there's also a handful of the usual stuff:
      
         - Fix for a crash on Armada 7K/8K when the kernel touches
           PSCI-reserved memory
      
         - Fix for PCIe reset on Macchiatobin (Armada 8K development board,
           what this email is sent from in fact :)
      
         - Enable a few new-merged modules for Amlogic in arm64 defconfig
      
         - Error path fixes on Integrator
      
         - Build fix for Renesas and Qualcomm
      
         - Initialization fix for Renesas RZ/G2E
      
        .. plus a few more fixlets"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (28 commits)
        ARM: integrator: impd1: use struct_size() in devm_kzalloc()
        qcom-scm: Include <linux/err.h> header
        gpio: pl061: handle failed allocations
        ARM: dts: kirkwood: Fix polarity of GPIO fan lines
        arm64: dts: marvell: mcbin: fix PCIe reset signal
        arm64: dts: marvell: armada-ap806: reserve PSCI area
        ARM: dts: da850-lcdk: Correct the sound card name
        ARM: dts: da850-lcdk: Correct the audio codec regulators
        ARM: dts: da850-evm: Correct the sound card name
        ARM: dts: da850-evm: Correct the audio codec regulators
        ARM: davinci: omapl138-hawk: fix label names in GPIO lookup entries
        ARM: davinci: dm644x-evm: fix label names in GPIO lookup entries
        ARM: davinci: dm355-evm: fix label names in GPIO lookup entries
        ARM: davinci: da850-evm: fix label names in GPIO lookup entries
        ARM: davinci: da830-evm: fix label names in GPIO lookup entries
        arm64: defconfig: enable modules for amlogic s400 sound card
        reset: uniphier-glue: Add AHCI reset control support in glue layer
        dt-bindings: reset: uniphier: Add AHCI core reset description
        reset: uniphier-usb3: Rename to reset-uniphier-glue
        dt-bindings: reset: uniphier: Replace the expression of USB3 with generic peripherals
        ...
      dbc3c09b
    • Linus Torvalds's avatar
      Merge tag 'for-5.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 6b529fb0
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
      
       - two regression fixes in clone/dedupe ioctls, the generic check
         callback needs to lock extents properly and wait for io to avoid
         problems with writeback and relocation
      
       - fix deadlock when using free space tree due to block group creation
      
       - a recently added check refuses a valid fileystem with seeding device,
         make that work again with a quickfix, proper solution needs more
         intrusive changes
      
      * tag 'for-5.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: Use real device structure to verify dev extent
        Btrfs: fix deadlock when using free space tree due to block group creation
        Btrfs: fix race between reflink/dedupe and relocation
        Btrfs: fix race between cloning range ending at eof and writeback
      6b529fb0
    • Linus Torvalds's avatar
      Merge tag 'driver-core-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · 72d657dd
      Linus Torvalds authored
      Pull driver core fixes from Greg KH:
       "Here is one small sysfs change, and a documentation update for 5.0-rc2
      
        The sysfs change moves from using BUG_ON to WARN_ON, as discussed in
        an email thread on lkml while trying to track down another driver bug.
        sysfs should not be crashing and preventing people from seeing where
        they went wrong. Now it properly recovers and warns the developer.
      
        The documentation update removes the use of BUS_ATTR() as the kernel
        is moving away from this to use the specific BUS_ATTR_RW() and friends
        instead. There are pending patches in all of the different subsystems
        to remove the last users of this macro, but for now, don't advertise
        it should be used anymore to keep new ones from being introduced.
      
        Both have been in linux-next with no reported issues"
      
      * tag 'driver-core-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        Documentation: driver core: remove use of BUS_ATTR
        sysfs: convert BUG_ON to WARN_ON
      72d657dd
    • Linus Torvalds's avatar
      Merge tag 'staging-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · f7c1038b
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are some small staging driver fixes for some reported issues.
      
        One reverts a patch that was made to the rtl8723bs driver that turned
        out to not be needed at all as it was a bug in clang. The others fix
        up some reported issues in the rtl8188eu driver and update the
        MAINTAINERS file to point to Larry for this driver so he can get the
        bug reports easier.
      
        All have been in linux-next with no reported issues"
      
      * tag 'staging-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        Revert "staging: rtl8723bs: Mark ACPI table declaration as used"
        staging: rtl8188eu: Fix module loading from tasklet for WEP encryption
        staging: rtl8188eu: Fix module loading from tasklet for CCMP encryption
        MAINTAINERS: Add entry for staging driver r8188eu
      f7c1038b
    • Linus Torvalds's avatar
      Merge tag 'tty-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 437e878a
      Linus Torvalds authored
      Pull tty/serial fixes from Greg KH:
       "Here are 2 tty and serial fixes for 5.0-rc2 that resolve some reported
        issues.
      
        The first is a simple serial driver fix for a regression that showed
        up in 5.0-rc1. The second one resolves a number of reported issues
        with the recent tty locking fixes that went into 5.0-rc1. Lots of
        people have tested the second one and say it resolves their issues.
      
        Both have been in linux-next with no reported issues"
      
      * tag 'tty-5.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        tty: Don't hold ldisc lock in tty_reopen() if ldisc present
        serial: lantiq: Do not swap register read/writes
      437e878a