1. 24 Nov, 2018 1 commit
    • Linus Torvalds's avatar
      Merge tag 'xfs-4.20-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · abe72ff4
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Dave and I have continued our work fixing corruption problems that can
        be found when running long-term burn-in exercisers on xfs. Here are
        some patches fixing most of the problems, but there will likely be
        more. :/
      
         - Numerous corruption fixes for copy on write
      
         - Numerous corruption fixes for blocksize < pagesize writes
      
         - Don't miscalculate AG reservations for small final AGs
      
         - Fix page cache truncation to work properly for reflink and extent
           shifting
      
         - Fix use-after-free when retrying failed inode/dquot buffer logging
      
         - Fix corruptions seen when using copy_file_range in directio mode"
      
      * tag 'xfs-4.20-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        iomap: readpages doesn't zero page tail beyond EOF
        vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP
        iomap: dio data corruption and spurious errors when pipes fill
        iomap: sub-block dio needs to zeroout beyond EOF
        iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents
        xfs: delalloc -> unwritten COW fork allocation can go wrong
        xfs: flush removing page cache in xfs_reflink_remap_prep
        xfs: extent shifting doesn't fully invalidate page cache
        xfs: finobt AG reserves don't consider last AG can be a runt
        xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers
        xfs: uncached buffer tracing needs to print bno
        xfs: make xfs_file_remap_range() static
        xfs: fix shared extent data corruption due to missing cow reservation
      abe72ff4
  2. 23 Nov, 2018 11 commits
    • Linus Torvalds's avatar
      Merge tag 'ceph-for-4.20-rc4' of https://github.com/ceph/ceph-client · 7c98a426
      Linus Torvalds authored
      Pullk ceph fix from Ilya Dryomov:
       "A messenger fix, marked for stable"
      
      * tag 'ceph-for-4.20-rc4' of https://github.com/ceph/ceph-client:
        libceph: fall back to sendmsg for slab pages
      7c98a426
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20181123' of git://git.kernel.dk/linux-block · 3381918f
      Linus Torvalds authored
      Pull block fix from Jens Axboe:
       "Just a single fix for this week, fixing an issue with nvme-fc"
      
      * tag 'for-linus-20181123' of git://git.kernel.dk/linux-block:
        nvme-fc: resolve io failures during connect
      3381918f
    • Linus Torvalds's avatar
      Merge tag 'iommu-fixes-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · d88783b9
      Linus Torvalds authored
      Pull IOMMU fixes from Joerg Roedel:
      
       - Two fixes for the Intel VT-d driver to fix a NULL-ptr dereference and
         an unbalance in an allocate/free path (allocated with memremap, freed
         with iounmap)
      
       - Fix for a crash in the Renesas IOMMU driver
      
       - Fix for the Advanced Virtual Interrupt Controler (AVIC) code in the
         AMD IOMMU driver
      
      * tag 'iommu-fixes-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/vt-d: Use memunmap to free memremap
        amd/iommu: Fix Guest Virtual APIC Log Tail Address Register
        iommu/ipmmu-vmsa: Fix crash on early domain free
        iommu/vt-d: Fix NULL pointer dereference in prq_event_thread()
      d88783b9
    • Linus Torvalds's avatar
      Merge tag 'acpi-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · a03bac58
      Linus Torvalds authored
      Pull ACPI fix from Rafael Wysocki:
       "Prevent the ACPI core from registering a platform device for the
        SMB0001 HID to avoid IRQ allocation issues (Hans de Goede)"
      
      * tag 'acpi-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / platform: Add SMB0001 HID to forbidden_id_list
      a03bac58
    • Linus Torvalds's avatar
      Merge tag 'pm-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · b88af994
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix two issues in the Operating Performance Points (OPP)
        framework, one cpufreq driver issue, one problem related to the tasks
        freezer and a few build-related issues in the cpupower utility.
      
        Specifics:
      
         - Fix tasks freezer deadlock in de_thread() that occurs if one of its
           sub-threads has been frozen already (Chanho Min).
      
         - Avoid registering a platform device by the ti-cpufreq driver on
           platforms that cannot use it (Dave Gerlach).
      
         - Fix a mistake in the ti-opp-supply operating performance points
           (OPP) driver that caused an incorrect reference voltage to be used
           and make it adjust the minimum voltage dynamically to avoid hangs
           or crashes in some cases (Keerthy).
      
         - Fix issues related to compiler flags in the cpupower utility and
           correct a linking problem in it by renaming a file with a duplicate
           name (Jiri Olsa, Konstantin Khlebnikov)"
      
      * tag 'pm-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        exec: make de_thread() freezable
        cpufreq: ti-cpufreq: Only register platform_device when supported
        opp: ti-opp-supply: Correct the supply in _get_optimal_vdd_voltage call
        opp: ti-opp-supply: Dynamically update u_volt_min
        tools cpupower: Override CFLAGS assignments
        tools cpupower debug: Allow to use outside build flags
        tools/power/cpupower: fix compilation with STATIC=true
      b88af994
    • Linus Torvalds's avatar
      Merge tag 'gpio-v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · e6005d3c
      Linus Torvalds authored
      Pull GPIO fixes from Linus Walleij:
       "Minor stuff except the IDA leak which was kind of important to fix.
        Also new maintainers, yay.
      
         - Do not lose an IDA on the gpiochip register errorpath.
      
         - Fix the PXA non-pincontrol GPIO-using platforms.
      
         - Fix the direction on the mockup GPIO driver.
      
         - Add some MAINTAINERS stuff: Bartosz stepped up as GPIO
           co-maintainer, and Andy established an Intel git tree"
      
      * tag 'gpio-v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        MAINTAINERS: Do maintain Intel GPIO drivers via separate tree
        gpio: mockup: fix indicated direction
        gpio: pxa: fix legacy non pinctrl aware builds again
        gpio: don't free unallocated ida on gpiochip_add_data_with_key() error path
        MAINTAINERS: add myself as co-maintainer of gpiolib
      e6005d3c
    • Linus Torvalds's avatar
      Merge tag 'mmc-v4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · dcd3aa31
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "MMC host:
      
         - sdhci-pci: Fixup card detect lookup
      
         - sdhci-pci: Workaround GLK firmware bug for tuning"
      
      * tag 'mmc-v4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: sdhci-pci: Workaround GLK firmware failing to restore the tuning value
        mmc: sdhci-pci: Try "cd" for card-detect lookup before using NULL
      dcd3aa31
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2018-11-23' of git://anongit.freedesktop.org/drm/drm · 9b7c880c
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Regular drm fixes:
      
        amdgpu:
         - Vega20 fixes
         - firmware loading fix
         - panel display fix
         - override fix
      
        i915:
         - Sandybridge lockup fix
         - fastboot DSI panel fix
         - GPU hang on Broxton
         - GPU reloc fixes on pineview/bearlake
      
        ast:
         - screen blurring fix
         - cursor appearance fix
      
        udmabuf:
         - mmap fix
      
        vc4:
         - NULL deref fix
         - async cursor update fix
      
        All seems pretty normal at this stage"
      
      * tag 'drm-fixes-2018-11-23' of git://anongit.freedesktop.org/drm/drm:
        drm/ast: fixed cursor may disappear sometimes
        drm/ast: change resolution may cause screen blurred
        drm/i915: Add rotation readout for plane initial config
        drm/i915: Force a LUT update in intel_initial_commit()
        drm/fb-helper: Blacklist writeback when adding connectors to fbdev
        drm/i915: Write GPU relocs harder with gen3
        drm/amdgpu: Enable HDP memory light sleep
        drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
        drm/amd/pp: handle negative values when reading OD
        drm/amdgpu: Add missing firmware entry for HAINAN
        drm/amd/powerplay: disable Vega20 DS related features
        drm/amdgpu: Fix oops when pp_funcs->switch_power_profile is unset
        drm/i915: Disable LP3 watermarks on all SNB machines
        drm/ast: Remove existing framebuffers before loading driver
        udmabuf: set read/write flag when exporting
        drm/amd/display: Support amdgpu "max bpc" connector property (v2)
        drm/amdgpu: Add amdgpu "max bpc" connector property (v2)
        drm/vc4: Set ->legacy_cursor_update to false when doing non-async updates
        drm/vc4: Fix NULL pointer dereference in the async update path
      9b7c880c
    • Rafael J. Wysocki's avatar
      Merge branches 'pm-cpufreq' and 'pm-sleep' · 1d50088c
      Rafael J. Wysocki authored
      * pm-cpufreq:
        cpufreq: ti-cpufreq: Only register platform_device when supported
      
      * pm-sleep:
        exec: make de_thread() freezable
      1d50088c
    • Rafael J. Wysocki's avatar
      Merge branches 'pm-opp' and 'pm-tools' · bec00cb5
      Rafael J. Wysocki authored
      * pm-opp:
        opp: ti-opp-supply: Correct the supply in _get_optimal_vdd_voltage call
        opp: ti-opp-supply: Dynamically update u_volt_min
      
      * pm-tools:
        tools cpupower: Override CFLAGS assignments
        tools cpupower debug: Allow to use outside build flags
        tools/power/cpupower: fix compilation with STATIC=true
      bec00cb5
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2018-11-22' of... · 98c9cdfd
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2018-11-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      - Fix for fastboot DSI panel boot time flicker regression, also fixes Bugzilla #108225
      - Fix Bugzilla #101269 to avoid GPU hangs on Sandybridge machines
      - Avoid GPU hang on error capture on Broxton with Vt-d enabled
      - Avoid missing GPU relocations on Pineview and Bearlake (Gen3)
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181122120555.GA18282@jlahtine-desk.ger.corp.intel.com
      98c9cdfd
  3. 22 Nov, 2018 10 commits
  4. 21 Nov, 2018 15 commits
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-4.20-rc4' of... · 92b41928
      Linus Torvalds authored
      Merge tag 'riscv-for-linus-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
      
      Pull RISC-V fixes from Palmer Dabbelt:
       "This week is a bit bigger than I expected. That's my fault, as I
        missed a few patches while I was at Plumbers last week. We have:
      
         - A fix to a quite embarassing issue where raw_copy_to_user() was
           implemented with asm_copy_from_user() (and vice versa).
      
         - Improvements to our makefile to allow flat binaries to be
           generated.
      
         - A build fix that predeclares "struct module" at the top of
           <asm/module.h>, which triggers warnings later in that header.
      
         - The addition of our own <uapi/asm/unistd> header, which is
           necessary to align our stat ABI on 32-bit systems.
      
         - A fix to avoid printing a warning when the S or U bits are set in
           print_isa().
      
        I already have one patch in the queue for next week"
      
      * tag 'riscv-for-linus-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
        RISC-V: recognize S/U mode bits in print_isa
        riscv: add asm/unistd.h UAPI header
        riscv: fix warning in arch/riscv/include/asm/module.h
        RISC-V: Build flat and compressed kernel images
        RISC-V: Fix raw_copy_{to,from}_user()
      92b41928
    • Dave Chinner's avatar
      iomap: readpages doesn't zero page tail beyond EOF · 8c110d43
      Dave Chinner authored
      When we read the EOF page of the file via readpages, we need
      to zero the region beyond EOF that we either do not read or
      should not contain data so that mmap does not expose stale data to
      user applications.
      
      However, iomap_adjust_read_range() fails to detect EOF correctly,
      and so fsx on 1k block size filesystems fails very quickly with
      mapreads exposing data beyond EOF. There are two problems here.
      
      Firstly, when calculating the end block of the EOF byte, we have
      to round the size by one to avoid a block aligned EOF from reporting
      a block too large. i.e. a size of 1024 bytes is 1 block, which in
      index terms is block 0. Therefore we have to calculate the end block
      from (isize - 1), not isize.
      
      The second bug is determining if the current page spans EOF, and so
      whether we need split it into two half, one for the IO, and the
      other for zeroing. Unfortunately, the code that checks whether
      we should split the block doesn't actually check if we span EOF, it
      just checks if the read spans the /offset in the page/ that EOF
      sits on. So it splits every read into two if EOF is not page
      aligned, regardless of whether we are reading the EOF block or not.
      
      Hence we need to restrict the "does the read span EOF" check to
      just the page that spans EOF, not every page we read.
      
      This patch results in correct EOF detection through readpages:
      
      xfs_vm_readpages:     dev 259:0 ino 0x43 nr_pages 24
      xfs_iomap_found:      dev 259:0 ino 0x43 size 0x66c00 offset 0x4f000 count 98304 type hole startoff 0x13c startblock 1368 blockcount 0x4
      iomap_readpage_actor: orig pos 323584 pos 323584, length 4096, poff 0 plen 4096, isize 420864
      xfs_iomap_found:      dev 259:0 ino 0x43 size 0x66c00 offset 0x50000 count 94208 type hole startoff 0x140 startblock 1497 blockcount 0x5c
      iomap_readpage_actor: orig pos 327680 pos 327680, length 94208, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 331776 pos 331776, length 90112, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 335872 pos 335872, length 86016, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 339968 pos 339968, length 81920, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 344064 pos 344064, length 77824, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 348160 pos 348160, length 73728, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 352256 pos 352256, length 69632, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 356352 pos 356352, length 65536, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 360448 pos 360448, length 61440, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 364544 pos 364544, length 57344, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 368640 pos 368640, length 53248, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 372736 pos 372736, length 49152, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 376832 pos 376832, length 45056, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 380928 pos 380928, length 40960, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 385024 pos 385024, length 36864, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 389120 pos 389120, length 32768, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 393216 pos 393216, length 28672, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 397312 pos 397312, length 24576, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 401408 pos 401408, length 20480, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 405504 pos 405504, length 16384, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 409600 pos 409600, length 12288, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 413696 pos 413696, length 8192, poff 0 plen 4096, isize 420864
      iomap_readpage_actor: orig pos 417792 pos 417792, length 4096, poff 0 plen 3072, isize 420864
      iomap_readpage_actor: orig pos 420864 pos 420864, length 1024, poff 3072 plen 1024, isize 420864
      
      As you can see, it now does full page reads until the last one which
      is split correctly at the block aligned EOF, reading 3072 bytes and
      zeroing the last 1024 bytes. The original version of the patch got
      this right, but it got another case wrong.
      
      The EOF detection crossing really needs to the the original length
      as plen, while it starts at the end of the block, will be shortened
      as up-to-date blocks are found on the page. This means "orig_pos +
      plen" no longer points to the end of the page, and so will not
      correctly detect EOF crossing. Hence we have to use the length
      passed in to detect this partial page case:
      
      xfs_filemap_fault:    dev 259:1 ino 0x43  write_fault 0
      xfs_vm_readpage:      dev 259:1 ino 0x43 nr_pages 1
      xfs_iomap_found:      dev 259:1 ino 0x43 size 0x2cc00 offset 0x2c000 count 4096 type hole startoff 0xb0 startblock 282 blockcount 0x4
      iomap_readpage_actor: orig pos 180224 pos 181248, length 4096, poff 1024 plen 2048, isize 183296
      xfs_iomap_found:      dev 259:1 ino 0x43 size 0x2cc00 offset 0x2cc00 count 1024 type hole startoff 0xb3 startblock 285 blockcount 0x1
      iomap_readpage_actor: orig pos 183296 pos 183296, length 1024, poff 3072 plen 1024, isize 183296
      
      Heere we see a trace where the first block on the EOF page is up to
      date, hence poff = 1024 bytes. The offset into the page of EOF is
      3072, so the range we want to read is 1024 - 3071, and the range we
      want to zero is 3072 - 4095. You can see this is split correctly
      now.
      
      This fixes the stale data beyond EOF problem that fsx quickly
      uncovers on 1k block size filesystems.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      8c110d43
    • Dave Chinner's avatar
      vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP · 494633fa
      Dave Chinner authored
      It returns EINVAL when the operation is not supported by the
      filesystem. Fix it to return EOPNOTSUPP to be consistent with
      the man page and clone_file_range().
      
      Clean up the inconsistent error return handling while I'm there.
      (I know, lipstick on a pig, but every little bit helps...)
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      494633fa
    • Dave Chinner's avatar
      iomap: dio data corruption and spurious errors when pipes fill · 4721a601
      Dave Chinner authored
      When doing direct IO to a pipe for do_splice_direct(), then pipe is
      trivial to fill up and overflow as it can only hold 16 pages. At
      this point bio_iov_iter_get_pages() then returns -EFAULT, and we
      abort the IO submission process. Unfortunately, iomap_dio_rw()
      propagates the error back up the stack.
      
      The error is converted from the EFAULT to EAGAIN in
      generic_file_splice_read() to tell the splice layers that the pipe
      is full. do_splice_direct() completely fails to handle EAGAIN errors
      (it aborts on error) and returns EAGAIN to the caller.
      
      copy_file_write() then completely fails to handle EAGAIN as well,
      and so returns EAGAIN to userspace, having failed to copy the data
      it was asked to.
      
      Avoid this whole steaming pile of fail by having iomap_dio_rw()
      silently swallow EFAULT errors and so do short reads.
      
      To make matters worse, iomap_dio_actor() has a stale data exposure
      bug bio_iov_iter_get_pages() fails - it does not zero the tail block
      that it may have been left uncovered by partial IO. Fix the error
      handling case to drop to the sub-block zeroing rather than
      immmediately returning the -EFAULT error.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      4721a601
    • Dave Chinner's avatar
      iomap: sub-block dio needs to zeroout beyond EOF · b450672f
      Dave Chinner authored
      If we are doing sub-block dio that extends EOF, we need to zero
      the unused tail of the block to initialise the data in it it. If we
      do not zero the tail of the block, then an immediate mmap read of
      the EOF block will expose stale data beyond EOF to userspace. Found
      with fsx running sub-block DIO sizes vs MAPREAD/MAPWRITE operations.
      
      Fix this by detecting if the end of the DIO write is beyond EOF
      and zeroing the tail if necessary.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      b450672f
    • Dave Chinner's avatar
      iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents · 0929d858
      Dave Chinner authored
      When we write into an unwritten extent via direct IO, we dirty
      metadata on IO completion to convert the unwritten extent to
      written. However, when we do the FUA optimisation checks, the inode
      may be clean and so we issue a FUA write into the unwritten extent.
      This means we then bypass the generic_write_sync() call after
      unwritten extent conversion has ben done and we don't force the
      modified metadata to stable storage.
      
      This violates O_DSYNC semantics. The window of exposure is a single
      IO, as the next DIO write will see the inode has dirty metadata and
      hence will not use the FUA optimisation. Calling
      generic_write_sync() after completion of the second IO will also
      sync the first write and it's metadata.
      
      Fix this by avoiding the FUA optimisation when writing to unwritten
      extents.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      0929d858
    • Dave Chinner's avatar
      xfs: delalloc -> unwritten COW fork allocation can go wrong · 9230a0b6
      Dave Chinner authored
      Long saga. There have been days spent following this through dead end
      after dead end in multi-GB event traces. This morning, after writing
      a trace-cmd wrapper that enabled me to be more selective about XFS
      trace points, I discovered that I could get just enough essential
      tracepoints enabled that there was a 50:50 chance the fsx config
      would fail at ~115k ops. If it didn't fail at op 115547, I stopped
      fsx at op 115548 anyway.
      
      That gave me two traces - one where the problem manifested, and one
      where it didn't. After refining the traces to have the necessary
      information, I found that in the failing case there was a real
      extent in the COW fork compared to an unwritten extent in the
      working case.
      
      Walking back through the two traces to the point where the CWO fork
      extents actually diverged, I found that the bad case had an extra
      unwritten extent in it. This is likely because the bug it led me to
      had triggered multiple times in those 115k ops, leaving stray
      COW extents around. What I saw was a COW delalloc conversion to an
      unwritten extent (as they should always be through
      xfs_iomap_write_allocate()) resulted in a /written extent/:
      
      xfs_writepage:        dev 259:0 ino 0x83 pgoff 0x17000 size 0x79a00 offset 0 length 0
      xfs_iext_remove:      dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/2 offset 32 block 152 count 20 flag 1 caller xfs_bmap_add_extent_delay_real
      xfs_bmap_pre_update:  dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/1 offset 1 block 4503599627239429 count 31 flag 0 caller xfs_bmap_add_extent_delay_real
      xfs_bmap_post_update: dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/1 offset 1 block 121 count 51 flag 0 caller xfs_bmap_add_ex
      
      Basically, Cow fork before:
      
      	0 1            32          52
      	+H+DDDDDDDDDDDD+UUUUUUUUUUU+
      	   PREV		RIGHT
      
      COW delalloc conversion allocates:
      
      	  1	       32
      	  +uuuuuuuuuuuu+
      	  NEW
      
      And the result according to the xfs_bmap_post_update trace was:
      
      	0 1            32          52
      	+H+wwwwwwwwwwwwwwwwwwwwwwww+
      	   PREV
      
      Which is clearly wrong - it should be a merged unwritten extent,
      not an unwritten extent.
      
      That lead me to look at the LEFT_FILLING|RIGHT_FILLING|RIGHT_CONTIG
      case in xfs_bmap_add_extent_delay_real(), and sure enough, there's
      the bug.
      
      It takes the old delalloc extent (PREV) and adds the length of the
      RIGHT extent to it, takes the start block from NEW, removes the
      RIGHT extent and then updates PREV with the new extent.
      
      What it fails to do is update PREV.br_state. For delalloc, this is
      always XFS_EXT_NORM, while in this case we are converting the
      delayed allocation to unwritten, so it needs to be updated to
      XFS_EXT_UNWRITTEN. This LF|RF|RC case does not do this, and so
      the resultant extent is always written.
      
      And that's the bug I've been chasing for a week - a bmap btree bug,
      not a reflink/dedupe/copy_file_range bug, but a BMBT bug introduced
      with the recent in core extent tree scalability enhancements.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      9230a0b6
    • Dave Chinner's avatar
      xfs: flush removing page cache in xfs_reflink_remap_prep · 2c307174
      Dave Chinner authored
      On a sub-page block size filesystem, fsx is failing with a data
      corruption after a series of operations involving copying a file
      with the destination offset beyond EOF of the destination of the file:
      
      8093(157 mod 256): TRUNCATE DOWN        from 0x7a120 to 0x50000 ******WWWW
      8094(158 mod 256): INSERT 0x25000 thru 0x25fff  (0x1000 bytes)
      8095(159 mod 256): COPY 0x18000 thru 0x1afff    (0x3000 bytes) to 0x2f400
      8096(160 mod 256): WRITE    0x5da00 thru 0x651ff        (0x7800 bytes) HOLE
      8097(161 mod 256): COPY 0x2000 thru 0x5fff      (0x4000 bytes) to 0x6fc00
      
      The second copy here is beyond EOF, and it is to sub-page (4k) but
      block aligned (1k) offset. The clone runs the EOF zeroing, landing
      in a pre-existing post-eof delalloc extent. This zeroes the post-eof
      extents in the page cache just fine, dirtying the pages correctly.
      
      The problem is that xfs_reflink_remap_prep() now truncates the page
      cache over the range that it is copying it to, and rounds that down
      to cover the entire start page. This removes the dirty page over the
      delalloc extent from the page cache without having written it back.
      Hence later, when the page cache is flushed, the page at offset
      0x6f000 has not been written back and hence exposes stale data,
      which fsx trips over less than 10 operations later.
      
      Fix this by changing xfs_reflink_remap_prep() to use
      xfs_flush_unmap_range().
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      2c307174
    • Jens Axboe's avatar
      Merge branch 'nvme-4.20' of git://git.infradead.org/nvme into for-linus · 14b04063
      Jens Axboe authored
      Pull NVMe fix from Christoph.
      
      * 'nvme-4.20' of git://git.infradead.org/nvme:
        nvme-fc: resolve io failures during connect
      14b04063
    • Rafael J. Wysocki's avatar
      Merge tag 'linux-cpupower-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux · 0db699f7
      Rafael J. Wysocki authored
      Pull cpupower utility updates for 4.20-rc4 from Shuah Khan:
      
      "This cpupower update for Linux 4.20-rc4 consists of compile fixes to allow
       use of outside build flags and override of CFLAGS from Jiri Olsa, and fix
       to compilation with STATIC=true from Konstantin Khlebnikov."
      
      * tag 'linux-cpupower-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
        tools cpupower: Override CFLAGS assignments
        tools cpupower debug: Allow to use outside build flags
        tools/power/cpupower: fix compilation with STATIC=true
      0db699f7
    • Ville Syrjälä's avatar
      drm/i915: Add rotation readout for plane initial config · f559156c
      Ville Syrjälä authored
      If we need to force a full plane update before userspace/fbdev
      have given us a proper plane state we should try to maintain the
      current plane state as much as possible (apart from the parts
      of the state we're trying to fix up with the plane update).
      To that end add basic readout for the plane rotation and
      maintain it during the initial fb takeover.
      
      Cc: Hans de Goede <hdegoede@redhat.com>
      Fixes: 516a49cc ("drm/i915: Fix assert_plane() warning on bootup with external display")
      Signed-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181120135450.3634-2-ville.syrjala@linux.intel.comTested-by: default avatarHans de Goede <hdegoede@redhat.com>
      Reviewed-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      Reviewed-by: default avatarMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      (cherry picked from commit f43348a3)
      Signed-off-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      f559156c
    • Ville Syrjälä's avatar
      drm/i915: Force a LUT update in intel_initial_commit() · c773058d
      Ville Syrjälä authored
      If we force a plane update to fix up our half populated plane state
      we'll also force on the pipe gamma for the plane (since we always
      enable pipe gamma currently). If the BIOS hasn't programmed a sensible
      LUT into the hardware this will cause the image to become corrupted.
      Typical symptoms are a purple/yellow/etc. flash when the driver loads.
      
      To avoid this let's program something sensible into the LUT when
      we do the plane update. In the future I plan to add proper plane
      gamma enable readout so this is just a temporary measure.
      
      Cc: Hans de Goede <hdegoede@redhat.com>
      Fixes: 516a49cc ("drm/i915: Fix assert_plane() warning on bootup with external display")
      Signed-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181120135450.3634-1-ville.syrjala@linux.intel.comTested-by: default avatarHans de Goede <hdegoede@redhat.com>
      Reviewed-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      (cherry picked from commit fa6af514)
      Signed-off-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      c773058d
    • Hans de Goede's avatar
      ACPI / platform: Add SMB0001 HID to forbidden_id_list · 2bbb5fa3
      Hans de Goede authored
      Many HP AMD based laptops contain an SMB0001 device like this:
      
      Device (SMBD)
      {
          Name (_HID, "SMB0001")  // _HID: Hardware ID
          Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
          {
              IO (Decode16,
                  0x0B20,             // Range Minimum
                  0x0B20,             // Range Maximum
                  0x20,               // Alignment
                  0x20,               // Length
                  )
              IRQ (Level, ActiveLow, Shared, )
                  {7}
          })
      }
      
      The legacy style IRQ resource here causes acpi_dev_get_irqresource() to
      be called with legacy=true and this message to show in dmesg:
      ACPI: IRQ 7 override to edge, high
      
      This causes issues when later on the AMD0030 GPIO device gets enumerated:
      
      Device (GPIO)
      {
          Name (_HID, "AMDI0030")  // _HID: Hardware ID
          Name (_CID, "AMDI0030")  // _CID: Compatible ID
          Name (_UID, Zero)  // _UID: Unique ID
          Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
          {
      	Name (RBUF, ResourceTemplate ()
      	{
      	    Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, )
      	    {
      		0x00000007,
      	    }
      	    Memory32Fixed (ReadWrite,
      		0xFED81500,         // Address Base
      		0x00000400,         // Address Length
      		)
      	})
      	Return (RBUF) /* \_SB_.GPIO._CRS.RBUF */
          }
      }
      
      Now acpi_dev_get_irqresource() gets called with legacy=false, but because
      of the earlier override of the trigger-type acpi_register_gsi() returns
      -EBUSY (because we try to register the same interrupt with a different
      trigger-type) and we end up setting IORESOURCE_DISABLED in the flags.
      
      The setting of IORESOURCE_DISABLED causes platform_get_irq() to call
      acpi_irq_get() which is not implemented on x86 and returns -EINVAL.
      resulting in the following in dmesg:
      
      amd_gpio AMDI0030:00: Failed to get gpio IRQ: -22
      amd_gpio: probe of AMDI0030:00 failed with error -22
      
      The SMB0001 is a "virtual" device in the sense that the only way the OS
      interacts with it is through calling a couple of methods to do SMBus
      transfers. As such it is weird that it has IO and IRQ resources at all,
      because the driver for it is not expected to ever access the hardware
      directly.
      
      The Linux driver for the SMB0001 device directly binds to the acpi_device
      through the acpi_bus, so we do not need to instantiate a platform_device
      for this ACPI device. This commit adds the SMB0001 HID to the
      forbidden_id_list, avoiding the instantiating of a platform_device for it.
      Not instantiating a platform_device means we will no longer call
      acpi_dev_get_irqresource() for the legacy IRQ resource fixing the probe of
      the AMDI0030 device failing.
      
      BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1644013
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198715
      BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199523Reported-by: default avatarLukas Kahnert <openproggerfreak@gmail.com>
      Tested-by: default avatarMarc <suaefar@googlemail.com>
      Cc: All applicable <stable@vger.kernel.org>
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      2bbb5fa3
    • Paul Kocialkowski's avatar
      drm/fb-helper: Blacklist writeback when adding connectors to fbdev · 8fd3b903
      Paul Kocialkowski authored
      Writeback connectors do not produce any on-screen output and require
      special care for use. Such connectors are hidden from enumeration in
      DRM resources by default, but they are still picked-up by fbdev.
      This makes rather little sense since fbdev is not really adapted for
      dealing with writeback.
      
      Moreover, this is also a source of issues when userspace disables the
      CRTC (and associated plane) without detaching the CRTC from the
      connector (which is hidden by default). In this case, the connector is
      still using the CRTC, leading to am "enabled/connectors mismatch" and
      eventually the failure of the associated atomic commit. This situation
      happens with VC4 testing under IGT GPU Tools.
      
      Filter out writeback connectors in the fbdev helper to solve this.
      Signed-off-by: default avatarPaul Kocialkowski <paul.kocialkowski@bootlin.com>
      Reviewed-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Reviewed-by: default avatarMaxime Ripard <maxime.ripard@bootlin.com>
      Tested-by: default avatarMaxime Ripard <maxime.ripard@bootlin.com>
      Fixes: 935774cd ("drm: Add writeback connector type")
      Cc: <stable@vger.kernel.org> # v4.19+
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181115163248.21168-1-paul.kocialkowski@bootlin.com
      8fd3b903
    • Chris Wilson's avatar
      drm/i915: Write GPU relocs harder with gen3 · f8577fb3
      Chris Wilson authored
      Under moderate amounts of GPU stress, we can observe on Bearlake and
      Pineview (later gen3 models) that we execute the following batch buffer
      before the write into the batch is coherent. Adding extra (tested with
      upto 32x) MI_FLUSH to either the invalidation, flush or both phases does
      not solve the incoherency issue with the relocations, but emitting the
      MI_STORE_DWORD_IMM twice does. So be it.
      
      Fixes: 7dd4f672 ("drm/i915: Async GPU relocation processing")
      Testcase: igt/gem_tiled_fence_blits # blb/pnv
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181119154153.15327-1-chris@chris-wilson.co.uk
      (cherry picked from commit 7fa28e14)
      Signed-off-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      f8577fb3
  5. 20 Nov, 2018 3 commits