1. 01 Aug, 2021 1 commit
    • Merge tag 'xfs-5.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · aa660326
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "This contains a bunch of bug fixes in XFS.
      
        Dave and I have been busy the last couple of weeks to find and fix as
        many log recovery bugs as we can find; here are the results so far. Go
        fstests -g recoveryloop! ;)
      
         - Fix a number of coordination bugs relating to cache flushes for
           metadata writeback, cache flushes for multi-buffer log writes, and
           FUA writes for single-buffer log writes
      
         - Fix a bug with incorrect replay of attr3 blocks
      
         - Fix unnecessary stalls when flushing logs to disk
      
         - Fix spoofing problems when recovering realtime bitmap blocks"
      
      * tag 'xfs-5.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: prevent spoofing of rtbitmap blocks when recovering buffers
        xfs: limit iclog tail updates
        xfs: need to see iclog flags in tracing
        xfs: Enforce attr3 buffer recovery order
        xfs: logging the on disk inode LSN can make it go backwards
        xfs: avoid unnecessary waits in xfs_log_force_lsn()
        xfs: log forces imply data device cache flushes
        xfs: factor out forced iclog flushes
        xfs: fix ordering violation between cache flushes and tail updates
        xfs: fold __xlog_state_release_iclog into xlog_state_release_iclog
        xfs: external logs need to flush data device
        xfs: flush data dev on external log write
  2. 31 Jul, 2021 1 commit
  3. 30 Jul, 2021 29 commits
    • Merge tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · c7d10223
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Networking fixes for 5.14-rc4, including fixes from bpf, can, WiFi
        (mac80211) and netfilter trees.
      
        Current release - regressions:
      
         - mac80211: fix starting aggregation sessions on mesh interfaces
      
        Current release - new code bugs:
      
         - sctp: send pmtu probe only if packet loss in Search Complete state
      
         - bnxt_en: add missing periodic PHC overflow check
      
         - devlink: fix phys_port_name of virtual port and merge error
      
         - hns3: change the method of obtaining default ptp cycle
      
         - can: mcba_usb_start(): add missing urb->transfer_dma initialization
      
        Previous releases - regressions:
      
         - set true network header for ECN decapsulation
      
         - mlx5e: RX, avoid possible data corruption w/ relaxed ordering and
           LRO
      
         - phy: re-add check for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54811
           PHY
      
         - sctp: fix return value check in __sctp_rcv_asconf_lookup
      
        Previous releases - always broken:
      
         - bpf:
             - more spectre corner case fixes, introduce a BPF nospec
               instruction for mitigating Spectre v4
             - fix OOB read when printing XDP link fdinfo
             - sockmap: fix cleanup related races
      
         - mac80211: fix enabling 4-address mode on a sta vif after assoc
      
         - can:
             - raw: raw_setsockopt(): fix raw_rcv panic for sock UAF
             - j1939: j1939_session_deactivate(): clarify lifetime of session
               object, avoid UAF
             - fix number of identical memory leaks in USB drivers
      
         - tipc:
             - do not blindly write skb_shinfo frags when doing decryption
             - fix sleeping in tipc accept routine"
      
      * tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
        gve: Update MAINTAINERS list
        can: esd_usb2: fix memory leak
        can: ems_usb: fix memory leak
        can: usb_8dev: fix memory leak
        can: mcba_usb_start(): add missing urb->transfer_dma initialization
        can: hi311x: fix a signedness bug in hi3110_cmd()
        MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver
        bpf: Fix leakage due to insufficient speculative store bypass mitigation
        bpf: Introduce BPF nospec instruction for mitigating Spectre v4
        sis900: Fix missing pci_disable_device() in probe and remove
        net: let flow have same hash in two directions
        nfc: nfcsim: fix use after free during module unload
        tulip: windbond-840: Fix missing pci_disable_device() in probe and remove
        sctp: fix return value check in __sctp_rcv_asconf_lookup
        nfc: s3fwrn5: fix undefined parameter values in dev_err()
        net/mlx5: Fix mlx5_vport_tbl_attr chain from u16 to u32
        net/mlx5e: Fix nullptr in mlx5e_hairpin_get_mdev()
        net/mlx5: Unload device upon firmware fatal error
        net/mlx5e: Fix page allocation failure for ptp-RQ over SF
        net/mlx5e: Fix page allocation failure for trap-RQ over SF
        ...
    • Merge tag 'acpi-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · e1dab4c0
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These revert a recent IRQ resources handling modification that turned
        out to be problematic, fix suspend-to-idle handling on AMD platforms
        to take upcoming systems into account properly and fix the retrieval
        of the DPTF attributes of the PCH FIVR.
      
        Specifics:
      
         - Revert recent change of the ACPI IRQ resources handling that
           attempted to improve the ACPI IRQ override selection logic, but
           introduced serious regressions on some systems (Hui Wang).
      
         - Fix up quirks for AMD platforms in the suspend-to-idle support code
           so as to take upcoming systems using uPEP HID AMDI007 into account
           as appropriate (Mario Limonciello).
      
         - Fix the code retrieving DPTF attributes of the PCH FIVR so that it
           agrees on the return data type with the ACPI control method
           evaluated for this purpose (Srinivas Pandruvada)"
      
      * tag 'acpi-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: DPTF: Fix reading of attributes
        Revert "ACPI: resources: Add checks for ACPI IRQ override"
        ACPI: PM: Add support for upcoming AMD uPEP HID AMDI007
    • pipe: make pipe writes always wake up readers · 3a34b13a
      Linus Torvalds authored
      Since commit 1b6b26ae ("pipe: fix and clarify pipe write wakeup
      logic") we have sanitized the pipe write logic, and would only try to
      wake up readers if they needed it.
      
      In particular, if the pipe already had data in it before the write,
      there was no point in trying to wake up a reader, since any existing
      readers must have been aware of the pre-existing data already.  Doing
      extraneous wakeups will only cause potential thundering herd problems.
      
      However, it turns out that some Android libraries have misused the EPOLL
      interface, and expected "edge triggered" to be "any new write will
      trigger it".  Even if there was no edge in sight.
      
      Quoting Sandeep Patil:
       "The commit 1b6b26ae ('pipe: fix and clarify pipe write wakeup
        logic') changed pipe write logic to wakeup readers only if the pipe
        was empty at the time of write. However, there are libraries that
        relied upon the older behavior for notification scheme similar to
        what's described in [1]
      
        One such library 'realm-core'[2] is used by numerous Android
        applications. The library uses a similar notification mechanism as GNU
        Make but it never drains the pipe until it is full. When Android moved
        to v5.10 kernel, all applications using this library stopped working.
      
        The library has since been fixed[3] but it will be a while before all
        applications incorporate the updated library"
      
      Our regression rule for the kernel is that if applications break from
      new behavior, it's a regression, even if it was because the application
      did something patently wrong.  Also note the original report [4] by
      Michael Kerrisk about a test for this epoll behavior - but at that point
      we didn't know of any actual broken use case.
      
      So add the extraneous wakeup, to approximate the old behavior.
      
      [ I say "approximate", because the exact old behavior was to do a wakeup
        not for each write(), but for each pipe buffer chunk that was filled
        in. The behavior introduced by this change is not that - this is just
        "every write will cause a wakeup, whether necessary or not", which
        seems to be sufficient for the broken library use. ]
      
      It's worth noting that this adds the extraneous wakeup only for the
      write side, while the read side still considers the "edge" to be purely
      about reading enough from the pipe to allow further writes.
      
      See commit f467a6a6 ("pipe: fix and clarify pipe read wakeup logic")
      for the pipe read case, which remains that "only wake up if the pipe was
      full, and we read something from it".
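
      A minimal user-space sketch of the behavior discussed above, assuming
      a Linux system with epoll.  It registers the read end of a pipe with
      EPOLLET, writes single bytes without ever draining the pipe, and counts
      how many writes produce an event.  count_edge_events() is a hypothetical
      helper name used only for illustration; it is not part of this commit.
      With a wakeup on every write the count should match the number of
      writes, while with "only wake if the pipe was empty" only the first
      write is expected to report, so the result depends on the running kernel.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: perform `nwrites` one-byte writes to a pipe that is
 * never drained and return how many of them produced an edge-triggered
 * EPOLLIN event on the read end.  Returns -1 on error.
 */
int count_edge_events(int nwrites)
{
    int fds[2];
    int events = 0;

    if (pipe(fds) < 0)
        return -1;

    int ep = epoll_create1(0);
    if (ep < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Watch the read end in edge-triggered mode. */
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    ev.data.fd = fds[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) < 0) {
        events = -1;
        goto out;
    }

    for (int i = 0; i < nwrites; i++) {
        struct epoll_event got;

        if (write(fds[1], "x", 1) != 1) {
            events = -1;
            goto out;
        }

        /* Timeout 0: only report an event that is already pending. */
        int n = epoll_wait(ep, &got, 1, 0);
        if (n < 0) {
            events = -1;
            goto out;
        }
        events += n;
    }

out:
    close(ep);
    close(fds[0]);
    close(fds[1]);
    return events;
}
```

      Calling count_edge_events(3) and comparing the result against 3 is one
      way to see which wakeup policy a given kernel applies; a library that
      relies on the result being 3 depends on the extraneous wakeups.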
      
      Link: https://lore.kernel.org/lkml/CAHk-=wjeG0q1vgzu4iJhW5juPkTsjTYmiqiMUYAebWW+0bam6w@mail.gmail.com/ [1]
      Link: https://github.com/realm/realm-core [2]
      Link: https://github.com/realm/realm-core/issues/4666 [3]
      Link: https://lore.kernel.org/lkml/CAKgNAkjMBGeAwF=2MKK758BhxvW58wYTgYKB2V-gY1PwXxrH+Q@mail.gmail.com/ [4]
      Link: https://lore.kernel.org/lkml/20210729222635.2937453-1-sspatil@android.com/
      Reported-by: Sandeep Patil <sspatil@android.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branches 'acpi-resources' and 'acpi-dptf' · e83f54ea
      Rafael J. Wysocki authored
      * acpi-resources:
        Revert "ACPI: resources: Add checks for ACPI IRQ override"
      
      * acpi-dptf:
        ACPI: DPTF: Fix reading of attributes
    • Merge tag 'block-5.14-2021-07-30' of git://git.kernel.dk/linux-block · 4669e13c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - gendisk freeing fix (Christoph)
      
       - blk-iocost wake ordering fix (Tejun)
      
       - tag allocation error handling fix (John)
      
       - loop locking fix. While this isn't the prettiest fix in the world,
         nobody has any good alternatives for 5.14. Something to likely
         revisit for 5.15. (Tetsuo)
      
      * tag 'block-5.14-2021-07-30' of git://git.kernel.dk/linux-block:
        block: delay freeing the gendisk
        blk-iocost: fix operation ordering in iocg_wake_fn()
        blk-mq-sched: Fix blk_mq_sched_alloc_tags() error handling
        loop: reintroduce global lock for safe loop_validate_file() traversal
    • Merge tag 'io_uring-5.14-2021-07-30' of git://git.kernel.dk/linux-block · 27eb687b
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - A fix for block backed reissue (me)
      
       - Reissue context hardening (me)
      
       - Async link locking fix (Pavel)
      
      * tag 'io_uring-5.14-2021-07-30' of git://git.kernel.dk/linux-block:
        io_uring: fix poll requests leaking second poll entries
        io_uring: don't block level reissue off completion path
        io_uring: always reissue from task_work context
        io_uring: fix race in unified task_work running
        io_uring: fix io_prep_async_link locking
    • Merge tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-block · f6c5971b
      Linus Torvalds authored
      Pull libata fixlets from Jens Axboe:
      
       - A fix for PIO highmem (Christoph)
      
       - Kill HAVE_IDE as it's now unused (Lukas)
      
      * tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-block:
        arch: Kconfig: clean up obsolete use of HAVE_IDE
        libata: fix ata_pio_sector for CONFIG_HIGHMEM
    • Merge tag 'for-5.14-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 051df241
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
      
       - fix -Warray-bounds warning, to help external patchset to make it
         default treewide
      
       - fix writeable device accounting (syzbot report)
      
       - fix fsync and log replay after a rename and inode eviction
      
       - fix potentially lost error code when submitting multiple bios for
         compressed range
      
      * tag 'for-5.14-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: calculate number of eb pages properly in csum_tree_block
        btrfs: fix rw device counting in __btrfs_free_extra_devids
        btrfs: fix lost inode on log replay after mix of fsync, rename and inode eviction
        btrfs: mark compressed range uptodate only if all bio succeed
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · 8723bc8f
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
      
       - resume timing fix for intel-ish driver (Ye Xiang)
      
       - fix for using incorrect MMIO register in amd_sfh driver (Dylan
         MacKenzie)
      
       - Cintiq 24HDT / 27QHDT regression fix and touch processing fix for
         Wacom driver (Jason Gerecke)
      
       - device removal bugfix for ft260 driver (Michael Zaidman)
      
       - other small assorted fixes
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: ft260: fix device removal due to USB disconnect
        HID: wacom: Skip processing of touches with negative slot values
        HID: wacom: Re-enable touch by default for Cintiq 24HDT / 27QHDT
        HID: Kconfig: Fix spelling mistake "Uninterruptable" -> "Uninterruptible"
        HID: apple: Add support for Keychron K1 wireless keyboard
        HID: fix typo in Kconfig
        HID: ft260: fix format type warning in ft260_word_show()
        HID: amd_sfh: Use correct MMIO register for DMA address
        HID: asus: Remove check for same LED brightness on set
        HID: intel-ish-hid: use async resume function
    • Merge branch 'akpm' (patches from Andrew) · ad6ec09d
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "7 patches.
      
        Subsystems affected by this patch series: lib, ocfs2, and mm (slub,
        migration, and memcg)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook()
        slub: fix unreclaimable slab stat for bulk free
        mm/migrate: fix NR_ISOLATED corruption on 64-bit
        mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
        ocfs2: issue zeroout to EOF blocks
        ocfs2: fix zero out valid data
        lib/test_string.c: move string selftest in the Runtime Testing menu
    • Merge tag 'linux-can-fixes-for-5.14-20210730' of... · 8d670412
      Jakub Kicinski authored
      Merge tag 'linux-can-fixes-for-5.14-20210730' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2021-07-30
      
      The first patch is by me and adds Yasushi SHOJI as a reviewer for the
      Microchip CAN BUS Analyzer Tool driver.
      
      Dan Carpenter's patch fixes a signedness bug in the hi311x driver.
      
      Pavel Skripkin provides 4 patches, the first targets the mcba_usb
      driver by adding the missing urb->transfer_dma initialization, which
      was broken in a previous commit. The last 3 patches fix a memory leak
      in the usb_8dev, ems_usb and esd_usb2 drivers.
      
      * tag 'linux-can-fixes-for-5.14-20210730' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
        can: esd_usb2: fix memory leak
        can: ems_usb: fix memory leak
        can: usb_8dev: fix memory leak
        can: mcba_usb_start(): add missing urb->transfer_dma initialization
        can: hi311x: fix a signedness bug in hi3110_cmd()
        MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver
      ====================
      
      Link: https://lore.kernel.org/r/20210730070526.1699867-1-mkl@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook() · 121dffe2
      Wang Hai authored
      When I use kfree_rcu() to free a large allocation made with
      kmalloc_node(), the following dump occurs.
      
        BUG: kernel NULL pointer dereference, address: 0000000000000020
        [...]
        Oops: 0000 [#1] SMP
        [...]
        Workqueue: events kfree_rcu_work
        RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline]
        RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline]
        RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363
        [...]
        Call Trace:
          kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293
          kfree_bulk include/linux/slab.h:413 [inline]
          kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300
          process_one_work+0x207/0x530 kernel/workqueue.c:2276
          worker_thread+0x320/0x610 kernel/workqueue.c:2422
          kthread+0x13d/0x160 kernel/kthread.c:313
          ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
      
      When kmalloc_node() is asked for a large allocation, a page is
      allocated rather than a slab object, so when that memory is freed via
      kfree_rcu() it should not be processed by memcg_slab_free_hook(),
      because memcg_slab_free_hook() is only meant for slab objects.

      Use page_objcgs_check() instead of page_objcgs() in
      memcg_slab_free_hook() to fix this bug.
      
      Link: https://lkml.kernel.org/r/20210728145655.274476-1-wanghai38@huawei.com
      Fixes: 270c6a71 ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data")
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slub: fix unreclaimable slab stat for bulk free · f227f0fa
      Shakeel Butt authored
      SLUB uses the page allocator for higher-order allocations and updates
      the unreclaimable slab stat for such allocations.  At the moment, the
      bulk free path for SLUB does not share code with the normal free path
      for these allocations and misses the stat update.  So, fix the stat
      update by using common code.  The user-visible impact of the bug is a
      potentially inconsistent unreclaimable slab stat visible through
      meminfo and vmstat.
      
      Link: https://lkml.kernel.org/r/20210728155354.3440560-1-shakeelb@google.com
      Fixes: 6a486c0a ("mm, sl[ou]b: improve memory accounting")
      Signed-off-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/migrate: fix NR_ISOLATED corruption on 64-bit · b5916c02
      Aneesh Kumar K.V authored
      Similar to commit 2da9f630 ("mm/vmscan: fix NR_ISOLATED_FILE
      corruption on 64-bit") avoid using unsigned int for nr_pages.  With
      an unsigned int type, the negated count wraps to a large unsigned int,
      which then converts to a large positive signed long.
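
      The conversion problem can be sketched in plain C.  This is an
      illustration under the assumption that the counter update takes a
      signed long delta and that the caller passes the negated page count;
      delta_with_unsigned() and delta_with_signed() are hypothetical names,
      not functions from mm/migrate.c.

```c
/*
 * Hypothetical illustration: compute the delta that would be handed to a
 * counter update taking a signed long.
 */

/* Negation happens in unsigned int, wraps modulo 2^32, then widens. */
long delta_with_unsigned(unsigned int nr_pages)
{
    return -nr_pages;
}

/* Negation happens in int, so the sign survives the widening to long. */
long delta_with_signed(int nr_pages)
{
    return -nr_pages;
}
```

      On a platform where long is 64 bits wide, delta_with_unsigned(512)
      yields 4294966784 rather than -512, so a counter meant to be
      decremented is incremented by a huge amount instead.  Where long is
      32 bits wide the wrapped value converts back to -512, which is why the
      corruption is specific to 64-bit.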
      
      Symptoms include CMA allocations hanging forever due to
      alloc_contig_range->...->isolate_migratepages_block waiting forever in
      "while (unlikely(too_many_isolated(pgdat)))".
      
      Link: https://lkml.kernel.org/r/20210728042531.359409-1-aneesh.kumar@linux.ibm.com
      Fixes: c5fc5c3a ("mm: migrate: account THP NUMA migration counters correctly")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Reported-by: Michael Ellerman <mpe@ellerman.id.au>
      Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Reviewed-by: Yang Shi <shy828301@gmail.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: David Hildenbrand <david@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code · 30def935
      Johannes Weiner authored
      Dan Carpenter reports:
      
          The patch 2d146aa3: "mm: memcontrol: switch to rstat" from Apr
          29, 2021, leads to the following static checker warning:
      
      	    kernel/cgroup/rstat.c:200 cgroup_rstat_flush()
      	    warn: sleeping in atomic context
      
          mm/memcontrol.c
            3572  static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
            3573  {
            3574          unsigned long val;
            3575
            3576          if (mem_cgroup_is_root(memcg)) {
            3577                  cgroup_rstat_flush(memcg->css.cgroup);
      			    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      
          This is from static analysis and potentially a false positive.  The
          problem is that mem_cgroup_usage() is called from __mem_cgroup_threshold()
          which holds an rcu_read_lock().  And the cgroup_rstat_flush() function
          can sleep.
      
            3578                  val = memcg_page_state(memcg, NR_FILE_PAGES) +
            3579                          memcg_page_state(memcg, NR_ANON_MAPPED);
            3580                  if (swap)
            3581                          val += memcg_page_state(memcg, MEMCG_SWAP);
            3582          } else {
            3583                  if (!swap)
            3584                          val = page_counter_read(&memcg->memory);
            3585                  else
            3586                          val = page_counter_read(&memcg->memsw);
            3587          }
            3588          return val;
            3589  }
      
      __mem_cgroup_threshold() indeed holds the rcu lock.  In addition, the
      thresholding code is invoked during stat changes, and those contexts
      have irqs disabled as well.  If the lock breaking occurs inside the
      flush function, it will result in a sleep from an atomic context.
      
      Use the irqsafe flushing variant in mem_cgroup_usage() to fix this.
      
      Link: https://lkml.kernel.org/r/20210726150019.251820-1-hannes@cmpxchg.org
      Fixes: 2d146aa3 ("mm: memcontrol: switch to rstat")
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Chris Down <chris@chrisdown.name>
      Reviewed-by: Rik van Riel <riel@surriel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Shakeel Butt <shakeelb@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: issue zeroout to EOF blocks · 9449ad33
      Junxiao Bi authored
      When punching holes in EOF blocks, fallocate used buffered writes to
      zero the EOF blocks in the last cluster.  But since ->writepage ignores
      EOF pages, those zeros are never flushed.

      This "looks" OK because commit 6bba4471 ("ocfs2: fix data corruption by
      fallocate") zeroes the EOF blocks when the file size is extended, but
      it isn't.  The problem lies with those EOF pages: before writeback, the
      pages had the DIRTY flag set and all buffer_heads in them also had the
      DIRTY flag set.  When writeback was run by write_cache_pages(), the
      DIRTY flag on the page was cleared, but the DIRTY flag on the
      buffer_heads was not.

      When the next write hit those EOF pages, the buffer_heads already had
      the DIRTY flag set, so the page was not marked DIRTY again.  That made
      writeback ignore the pages forever, which causes data corruption.  Even
      a direct I/O write can't help, because it fails when trying to drop the
      page cache before the direct I/O: it finds that the buffer_heads for
      those pages still have the DIRTY flag set and falls back to buffered
      I/O mode.

      To summarize the issue: because writeback ignores EOF pages, once any
      EOF page is generated, any write to it only goes to the page cache and
      is never flushed to disk, even after the file size is extended and the
      page is no longer an EOF page.  The fix is to avoid zeroing EOF blocks
      with buffered writes.
      
      The following code snippet from qemu-img could trigger the corruption.
      
        656   open("6b3711ae-3306-4bdd-823c-cf1c0060a095.conv.2", O_RDWR|O_DIRECT|O_CLOEXEC) = 11
        ...
        660   fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2275868672, 327680 <unfinished ...>
        660   fallocate(11, 0, 2275868672, 327680) = 0
        658   pwrite64(11, "
      
      Link: https://lkml.kernel.org/r/20210722054923.24389-2-junxiao.bi@oracle.com
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ocfs2: fix zero out valid data · f267aeb6
      Junxiao Bi authored
      If the append-dio feature is enabled, direct I/O writes and fallocate
      can run in parallel to extend the file size.  fallocate uses
      "orig_isize" to record i_size before taking "ip_alloc_sem", so by the
      time ocfs2_zeroout_partial_cluster() zeroes out the EOF blocks, i_size
      may already have been extended by ocfs2_dio_end_io_write(), which
      causes valid data to be zeroed out.
      
      Link: https://lkml.kernel.org/r/20210722054923.24389-1-junxiao.bi@oracle.com
      Fixes: 6bba4471 ("ocfs2: fix data corruption by fallocate")
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib/test_string.c: move string selftest in the Runtime Testing menu · b2ff70a0
      Matteo Croce authored
      STRING_SELFTEST is presented in the "Library routines" menu.  Move it to
      Kernel hacking > Kernel Testing and Coverage > Runtime Testing together
      with other similar tests found in lib/.
      
      	--- Runtime Testing
      	<*>   Test functions located in the hexdump module at runtime
      	<*>   Test string functions (NEW)
      	<*>   Test functions located in the string_helpers module at runtime
      	<*>   Test strscpy*() family of functions at runtime
      	<*>   Test kstrto*() family of functions at runtime
      	<*>   Test printf() family of functions at runtime
      	<*>   Test scanf() family of functions at runtime
      
      Link: https://lkml.kernel.org/r/20210719185158.190371-1-mcroce@linux.microsoft.com
      Signed-off-by: Matteo Croce <mcroce@microsoft.com>
      Cc: Peter Rosin <peda@axentia.se>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • gve: Update MAINTAINERS list · 028a7177
      Catherine Sullivan authored
      The team maintaining the gve driver has undergone some changes; this
      updates the MAINTAINERS file accordingly.
      Signed-off-by: Catherine Sullivan <csully@google.com>
      Signed-off-by: Jon Olson <jonolson@google.com>
      Signed-off-by: David Awogbemila <awogbemila@google.com>
      Signed-off-by: Jeroen de Borst <jeroendb@google.com>
      Link: https://lore.kernel.org/r/20210729155258.442650-1-csully@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • arch: Kconfig: clean up obsolete use of HAVE_IDE · 094121ef
      Lukas Bulwahn authored
      The arch-specific Kconfig files use HAVE_IDE to indicate if IDE is
      supported.
      
      As IDE support and the HAVE_IDE config vanish with commit b7fb14d3
      ("ide: remove the legacy ide driver"), there is no need to mention
      HAVE_IDE in all those arch-specific Kconfig files.
      
      The issue was identified with ./scripts/checkkconfigsymbols.py.
      
      Fixes: b7fb14d3 ("ide: remove the legacy ide driver")
      Suggested-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Link: https://lore.kernel.org/r/20210728182115.4401-1-lukas.bulwahn@gmail.com
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • can: esd_usb2: fix memory leak · 928150fa
      Pavel Skripkin authored
      In esd_usb2_setup_rx_urbs() MAX_RX_URBS coherent buffers are allocated
      and there is nothing that frees them:
      
      1) In callback function the urb is resubmitted and that's all
      2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
         is not set (see esd_usb2_setup_rx_urbs) and this flag cannot be used
         with coherent buffers.
      
      So, all allocated buffers should be freed with usb_free_coherent()
      explicitly.
      
      Side note: This code looks like a copy-paste of other CAN drivers. The
      same patch was applied to the mcba_usb driver and works well with real
      hardware. There is no change in functionality, only clean-up code for
      coherent buffers.
      
      Fixes: 96d8e903 ("can: Add driver for esd CAN-USB/2 device")
      Link: https://lore.kernel.org/r/b31b096926dcb35998ad0271aac4b51770ca7cc8.1627404470.git.paskripkin@gmail.com
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      928150fa
    • Pavel Skripkin's avatar
      can: ems_usb: fix memory leak · 9969e3c5
      Pavel Skripkin authored
      In ems_usb_start() MAX_RX_URBS coherent buffers are allocated and
      there is nothing that frees them:
      
      1) In callback function the urb is resubmitted and that's all
      2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
         is not set (see ems_usb_start) and this flag cannot be used with
         coherent buffers.
      
      So, all allocated buffers should be freed with usb_free_coherent()
      explicitly.
      
      Side note: This code looks like a copy-paste of other CAN drivers. The
      same patch was applied to the mcba_usb driver and works well with real
      hardware. There is no change in functionality, only clean-up code for
      coherent buffers.
      
      Fixes: 702171ad ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface")
      Link: https://lore.kernel.org/r/59aa9fbc9a8cbf9af2bbd2f61a659c480b415800.1627404470.git.paskripkin@gmail.com
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      9969e3c5
    • Pavel Skripkin's avatar
      can: usb_8dev: fix memory leak · 0e865f0c
      Pavel Skripkin authored
      In usb_8dev_start() MAX_RX_URBS coherent buffers are allocated and
      there is nothing that frees them:
      
      1) In callback function the urb is resubmitted and that's all
      2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER
         is not set (see usb_8dev_start) and this flag cannot be used with
         coherent buffers.
      
      So, all allocated buffers should be freed with usb_free_coherent()
      explicitly.
      
      Side note: This code looks like a copy-paste of other CAN drivers. The
      same patch was applied to the mcba_usb driver and works well with real
      hardware. There is no change in functionality, only clean-up code for
      coherent buffers.
      
      Fixes: 0024d8ad ("can: usb_8dev: Add support for USB2CAN interface from 8 devices")
      Link: https://lore.kernel.org/r/d39b458cd425a1cf7f512f340224e6e9563b07bd.1627404470.git.paskripkin@gmail.com
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      0e865f0c
    • Pavel Skripkin's avatar
      can: mcba_usb_start(): add missing urb->transfer_dma initialization · fc43fb69
      Pavel Skripkin authored
      Yasushi reported that his Microchip CAN Analyzer stopped working
      since commit 91c02557 ("can: mcba_usb: fix memory leak in
      mcba_usb"). The problem was a missing urb->transfer_dma
      initialization.
      
      In my previous patch to this driver I refactored the mcba_usb_start()
      code to avoid leaking USB coherent buffers. To achieve this, I passed
      a local stack variable to usb_alloc_coherent() and then saved it to a
      private array so that all coherent buffers are correctly freed on the
      ->close() call. But I forgot to initialize urb->transfer_dma with the
      variable passed to usb_alloc_coherent().
      
      This caused the device to stop working, since DMA address 0 is not
      valid. The following log from the bug report page points exactly to
      the problem described above.
      
      | DMAR: [DMA Write] Request device [00:14.0] PASID ffffffff fault addr 0 [fault reason 05] PTE Write access is not set
      
      Fixes: 91c02557 ("can: mcba_usb: fix memory leak in mcba_usb")
      Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=990850
      Link: https://lore.kernel.org/r/20210725103630.23864-1-paskripkin@gmail.com
      Cc: linux-stable <stable@vger.kernel.org>
      Reported-by: Yasushi SHOJI <yasushi.shoji@gmail.com>
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Tested-by: Yasushi SHOJI <yashi@spacecubics.com>
      [mkl: fixed typos in commit message - thanks Yasushi SHOJI]
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      fc43fb69
    • Dan Carpenter's avatar
      can: hi311x: fix a signedness bug in hi3110_cmd() · f6b3c784
      Dan Carpenter authored
      hi3110_cmd() is supposed to return zero on success and negative
      error codes on failure, but it was accidentally declared as returning
      u8 when it needs to return int.
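
To illustrate why an unsigned 8-bit return type hides failures, here is a small Python sketch. It is not driver code: spi_write_stub() is a hypothetical stand-in for the SPI transfer, and ctypes.c_uint8 is used only to mimic the truncation that a u8 return type would perform in C.

```python
import ctypes

EIO = 5  # errno value for "I/O error" on Linux


def spi_write_stub(fail):
    """Hypothetical stand-in for an SPI transfer: 0 on success, -EIO on failure."""
    return -EIO if fail else 0


def cmd_as_u8(fail):
    """Mimic a function declared to return u8: the result is truncated to 0..255."""
    return ctypes.c_uint8(spi_write_stub(fail)).value


def cmd_as_int(fail):
    """Mimic the same function declared to return int: the sign is preserved."""
    return ctypes.c_int(spi_write_stub(fail)).value


if __name__ == "__main__":
    # With u8, -EIO becomes 251, so a caller's "ret < 0" check never fires.
    print(cmd_as_u8(True), cmd_as_u8(True) < 0)    # 251 False
    # With int, the negative error code survives and the check works.
    print(cmd_as_int(True), cmd_as_int(True) < 0)  # -5 True
```

A caller that tests only for a negative return value would treat the truncated result as success, which is the kind of silent failure the type change avoids.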
      
      Fixes: 57e83fb9 ("can: hi311x: Add Holt HI-311x CAN driver")
      Link: https://lore.kernel.org/r/20210729141246.GA1267@kili
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      f6b3c784
    • Marc Kleine-Budde's avatar
      MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver · 8a7b46fa
      Marc Kleine-Budde authored
      This patch adds Yasushi SHOJI as a reviewer for the Microchip CAN BUS
      Analyzer Tool driver.
      
      Link: https://lore.kernel.org/r/20210726111619.1023991-1-mkl@pengutronix.de
      Acked-by: Yasushi SHOJI <yashi@spacecubics.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      8a7b46fa
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2021-07-30' of git://anongit.freedesktop.org/drm/drm · 764a5bc8
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Regular drm fixes pull, seems about the right size, lots of small
        fixes across the board, mostly amdgpu, but msm and i915 are in there
        along with panel and ttm.
      
        amdgpu:
         - Fix resource leak in an error path
         - Avoid stack contents exposure in error path
         - pmops check fix for S0ix vs S3
         - DCN 2.1 display fixes
         - DCN 2.0 display fix
         - Backlight control fix for laptops with HDR panels
         - Maintainers updates
      
        i915:
         - Fix vbt port mask
         - Fix around reading the right DSC disable fuse in display_ver 10
         - Split display version 9 and 10 in intel_setup_outputs
      
        msm:
         - iommu fault display fix
         - misc dp compliance fixes
         - dpu reg sizing fix
      
        panel:
         - Fix bpc for ytc700tlag_05_201c
      
        ttm:
         - debugfs init fixes"
      
      * tag 'drm-fixes-2021-07-30' of git://anongit.freedesktop.org/drm/drm:
        maintainers: add bugs and chat URLs for amdgpu
        drm/amdgpu/display: only enable aux backlight control for OLED panels
        drm/amd/display: ensure dentist display clock update finished in DCN20
        drm/amd/display: Add missing DCN21 IP parameter
        drm/amd/display: Guard DST_Y_PREFETCH register overflow in DCN21
        drm/amdgpu: Check pmops for desired suspend state
        drm/msm/dp: Initialize dp->aux->drm_dev before registration
        drm/msm/dp: signal audio plugged change at dp_pm_resume
        drm/msm/dp: Initialize the INTF_CONFIG register
        drm/msm/dp: use dp_ctrl_off_link_stream during PHY compliance test run
        drm/msm: Fix display fault handling
        drm/msm/dpu: Fix sm8250_mdp register length
        drm/amdgpu: Avoid printing of stack contents on firmware load error
        drm/amdgpu: Fix resource leak on probe error path
        drm/i915/display: split DISPLAY_VER 9 and 10 in intel_setup_outputs()
        drm/i915: fix not reading DSC disable fuse in GLK
        drm/i915/bios: Fix ports mask
        drm/panel: panel-simple: Fix proper bpc for ytc700tlag_05_201c
        drm/ttm: Initialize debugfs from ttm_global_init()
      764a5bc8
    • Linus Torvalds's avatar
      Merge tag 'fallthrough-fixes-clang-5.14-rc4' of... · c71a2f65
      Linus Torvalds authored
      Merge tag 'fallthrough-fixes-clang-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
      
      Pull fallthrough fixes from Gustavo Silva:
       "Fix some fall-through warnings when building with Clang and
        '-Wimplicit-fallthrough' on ARM"
      
      * tag 'fallthrough-fixes-clang-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
        scsi: fas216: Fix fall-through warning for Clang
        scsi: acornscsi: Fix fall-through warning for clang
        ARM: riscpc: Fix fall-through warning for Clang
      c71a2f65
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha · cade08a5
      Linus Torvalds authored
      Pull alpha updates from Matt Turner:
       "They're mostly small janitorial fixes but there's also more important
        ones:
      
         - drop the alpha-specific x86 binary loader (David Hildenbrand)
      
         - regression fix for at least Marvel platforms (Mike Rapoport)
      
         - fix for a scary-looking typo (Zheng Yongjun)"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
        alpha: register early reserved memory in memblock
        alpha: fix spelling mistakes
        alpha: Remove space between * and parameter name
        alpha: fp_emul: avoid init/cleanup_module names
        alpha: Add syscall_get_return_value()
        binfmt: remove support for em86 (alpha only)
        alpha: fix typos in a comment
        alpha: defconfig: add necessary configs for boot testing
        alpha: Send stop IPI to send to online CPUs
        alpha: convert comma to semicolon
        alpha: remove undef inline in compiler.h
        alpha: Kconfig: Replace HTTP links with HTTPS ones
        alpha: __udiv_qrnnd should be exported
      cade08a5
  4. 29 Jul, 2021 9 commits
    • Gustavo A. R. Silva's avatar
      scsi: fas216: Fix fall-through warning for Clang · cb163627
      Gustavo A. R. Silva authored
      Fix the following fallthrough warning (on ARM):
      
      drivers/scsi/arm/fas216.c:1379:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
                 default:
                 ^
         drivers/scsi/arm/fas216.c:1379:2: note: insert 'break;' to avoid fall-through
                 default:
                 ^
                 break;
      
      Reported-by: kernel test robot <lkp@intel.com>
      Link: https://lore.kernel.org/lkml/202107260355.bF00i5bi-lkp@intel.com/
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      cb163627
    • Gustavo A. R. Silva's avatar
      scsi: acornscsi: Fix fall-through warning for clang · eb4f520c
      Gustavo A. R. Silva authored
      Fix the following fallthrough warning (on ARM):
      
      drivers/scsi/arm/acornscsi.c:2651:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
                 case res_success:
                 ^
         drivers/scsi/arm/acornscsi.c:2651:2: note: insert '__attribute__((fallthrough));' to silence this warning
                 case res_success:
                 ^
                 __attribute__((fallthrough));
         drivers/scsi/arm/acornscsi.c:2651:2: note: insert 'break;' to avoid fall-through
                 case res_success:
                 ^
                 break;
      
      Reported-by: kernel test robot <lkp@intel.com>
      Link: https://lore.kernel.org/lkml/202107260355.bF00i5bi-lkp@intel.com/
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      eb4f520c
    • Gustavo A. R. Silva's avatar
      ARM: riscpc: Fix fall-through warning for Clang · 696e572d
      Gustavo A. R. Silva authored
      Fix the following fallthrough warning:
      
      arch/arm/mach-rpc/riscpc.c:52:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
                 default:
                 ^
      arch/arm/mach-rpc/riscpc.c:52:2: note: insert 'break;' to avoid fall-through
                 default:
                 ^
                 break;
      
      Reported-by: kernel test robot <lkp@intel.com>
      Link: https://lore.kernel.org/lkml/202107260355.bF00i5bi-lkp@intel.com/
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      696e572d
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 7e96bf47
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "ARM:
      
         - Fix MTE shared page detection
      
         - Enable selftest's use of PMU registers when asked to
      
        s390:
      
         - restore 5.13 debugfs names
      
        x86:
      
         - fix sizes for vcpu-id indexed arrays
      
         - fixes for AMD virtualized LAPIC (AVIC)
      
         - other small bugfixes
      
        Generic:
      
         - access tracking performance test
      
         - dirty_log_perf_test command line parsing fix
      
         - Fix selftest use of obsolete pthread_yield() in favour of
           sched_yield()
      
         - use cpu_relax when halt polling
      
         - fixed missing KVM_CLEAR_DIRTY_LOG compat ioctl"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: add missing compat KVM_CLEAR_DIRTY_LOG
        KVM: use cpu_relax when halt polling
        KVM: SVM: use vmcb01 in svm_refresh_apicv_exec_ctrl
        KVM: SVM: tweak warning about enabled AVIC on nested entry
        KVM: SVM: svm_set_vintr don't warn if AVIC is active but is about to be deactivated
        KVM: s390: restore old debugfs names
        KVM: SVM: delay svm_vcpu_init_msrpm after svm->vmcb is initialized
        KVM: selftests: Introduce access_tracking_perf_test
        KVM: selftests: Fix missing break in dirty_log_perf_test arg parsing
        x86/kvm: fix vcpu-id indexed array sizes
        KVM: x86: Check the right feature bit for MSR_KVM_ASYNC_PF_ACK access
        docs: virt: kvm: api.rst: replace some characters
        KVM: Documentation: Fix KVM_CAP_ENFORCE_PV_FEATURE_CPUID name
        KVM: nSVM: Swap the parameter order for svm_copy_vmrun_state()/svm_copy_vmloadsave_state()
        KVM: nSVM: Rename nested_svm_vmloadsave() to svm_copy_vmloadsave_state()
        KVM: arm64: selftests: get-reg-list: actually enable pmu regs in pmu sublist
        KVM: selftests: change pthread_yield to sched_yield
        KVM: arm64: Fix detection of shared VMAs on guest fault
      7e96bf47
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu · 2b99c470
      Linus Torvalds authored
      Pull m68knommu fix from Greg Ungerer:
       "A single compile time fix"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
        m68k/coldfire: change pll var. to clk_pll
      2b99c470
    • Darrick J. Wong's avatar
      xfs: prevent spoofing of rtbitmap blocks when recovering buffers · 81a448d7
      Darrick J. Wong authored
      While reviewing the buffer item recovery code, the thought occurred to
      me: in V5 filesystems we use log sequence number (LSN) tracking to avoid
      replaying older metadata updates against newer log items.  However, we
      use the magic number of the ondisk buffer to find the LSN of the ondisk
      metadata, which means that if an attacker can control the layout of the
      realtime device precisely enough that the start of an rt bitmap block
      matches the magic and UUID of some other kind of block, they can control
      the purported LSN of that spoofed block and thereby break log replay.
      
      Since realtime bitmap and summary blocks don't have headers at all, we
      have no way to tell if a block really should be replayed.  The best we
      can do is replay unconditionally and hope for the best.
      
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
      81a448d7
    • Dave Chinner's avatar
      xfs: limit iclog tail updates · 9d110014
      Dave Chinner authored
      From the department of "generic/482 keeps on giving", we bring you
      another tail update race condition:
      
      iclog:
      	S1			C1
      	+-----------------------+-----------------------+
      				 S2			EOIC
      
      Two checkpoints in a single iclog. One is complete, the other just
      contains the start record and overruns into a new iclog.
      
      Timeline:
      
      Before S1:	Cache flush, log tail = X
      At S1:		Metadata stable, write start record and checkpoint
      At C1:		Write commit record, set NEED_FUA
      		Single iclog checkpoint, so no need for NEED_FLUSH
      		Log tail still = X, so no need for NEED_FLUSH
      
      After C1,
      Before S2:	Cache flush, log tail = X
      At S2:		Metadata stable, write start record and checkpoint
      After S2:	Log tail moves to X+1
      At EOIC:	End of iclog, more journal data to write
      		Releases iclog
      		Not a commit iclog, so no need for NEED_FLUSH
      		Writes log tail X+1 into iclog.
      
      At this point, the iclog has tail X+1 and NEED_FUA set. There has
      been no cache flush for the metadata between X and X+1, and the
      iclog writes the new tail permanently to the log. This is sufficient
      to violate on disk metadata/journal ordering.
      
      We have two options here. The first is to detect this case in some
      manner and ensure that the partial checkpoint write sets NEED_FLUSH
      when the iclog is already marked NEED_FUA and the log tail changes.
      This seems somewhat fragile and quite complex to get right, and it
      doesn't actually make it obvious what underlying problem it is
      actually addressing from reading the code.
      
      The second option seems much cleaner to me, because it is derived
      directly from the requirements of the C1 commit record in the iclog.
      That is, when we write this commit record to the iclog, we've
      guaranteed that the metadata/data ordering is correct for tail
      update purposes. Hence if we only write the log tail into the iclog
      for the *first* commit record rather than the log tail at the last
      release, we guarantee that the log tail does not move past where the
      first commit record in the log expects it to be.
      
      IOWs, taking the first option means that replay of C1 becomes
      dependent on future operations doing the right thing, not just the
      C1 checkpoint itself doing the right thing. This makes log recovery
      almost impossible to reason about because now we have to take into
      account what might or might not have happened in the future when
      looking at checkpoints in the log rather than just having to
      reconstruct the past...
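
The timeline and the second option can be modelled with a toy Python sketch. This is only an illustration under stated assumptions, not the kernel implementation: the Iclog class, its field names, the limit_tail_updates switch and the value chosen for X are all hypothetical, and flushing is not modelled at all. It shows only which tail value ends up in the iclog header under each policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Iclog:
    """Toy in-core log buffer; field names are illustrative, not the kernel's."""
    tail_lsn: Optional[int] = None  # tail value stamped into the iclog header
    need_fua: bool = False


def write_commit_record(iclog, current_tail, limit_tail_updates):
    """Commit record written: mark FUA and, under the new policy, pin the tail."""
    iclog.need_fua = True
    if limit_tail_updates and iclog.tail_lsn is None:
        iclog.tail_lsn = current_tail  # only the first commit record sets it


def release(iclog, current_tail, limit_tail_updates):
    """Iclog released for writing: the old policy stamps the tail seen at release."""
    if not limit_tail_updates or iclog.tail_lsn is None:
        iclog.tail_lsn = current_tail
    return iclog.tail_lsn


def run_timeline(limit_tail_updates):
    x = 100                      # arbitrary stand-in for log tail X
    tail = x
    iclog = Iclog()
    write_commit_record(iclog, tail, limit_tail_updates)  # at C1
    tail = x + 1                 # after S2 the log tail moves to X+1
    return release(iclog, tail, limit_tail_updates)       # at EOIC


if __name__ == "__main__":
    print(run_timeline(limit_tail_updates=False))  # 101: tail X+1 written without a flush
    print(run_timeline(limit_tail_updates=True))   # 100: tail stays at X, as C1 expects
```

In this model the tail written to the log depends only on the state at the first commit record, so later partial checkpoints in the same iclog cannot move it.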
      
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      9d110014
    • Dave Chinner's avatar
      xfs: need to see iclog flags in tracing · b2ae3a9e
      Dave Chinner authored
      Because I cannot tell if the NEED_FLUSH flag is being set correctly
      by the log force and CIL push machinery without it.
      
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      b2ae3a9e
    • Dave Chinner's avatar
      xfs: Enforce attr3 buffer recovery order · d8f4c2d0
      Dave Chinner authored
      From the department of "WTAF? How did we miss that!?"...
      
      When we are recovering a buffer, the first thing we do is check the
      buffer magic number and extract the LSN from the buffer. If the LSN
      is older than the current LSN, we replay the modification to it. If
      the metadata on disk is newer than the transaction in the log, we
      skip it. This is a fundamental v5 filesystem metadata recovery
      behaviour.
      
      generic/482 failed with an attribute writeback failure during log
      recovery. The write verifier caught the corruption before it got
      written to disk, and the attr buffer dump looked like:
      
      XFS (dm-3): Metadata corruption detected at xfs_attr3_leaf_verify+0x275/0x2e0, xfs_attr3_leaf block 0x19be8
      XFS (dm-3): Unmount and run xfs_repair
      XFS (dm-3): First 128 bytes of corrupted metadata buffer:
      00000000: 00 00 00 00 00 00 00 00 3b ee 00 00 4d 2a 01 e1  ........;...M*..
      00000010: 00 00 00 00 00 01 9b e8 00 00 00 01 00 00 05 38  ...............8
                                        ^^^^^^^^^^^^^^^^^^^^^^^
      00000020: df 39 5e 51 58 ac 44 b6 8d c5 e7 10 44 09 bc 17  .9^QX.D.....D...
      00000030: 00 00 00 00 00 02 00 83 00 03 00 cc 0f 24 01 00  .............$..
      00000040: 00 68 0e bc 0f c8 00 10 00 00 00 00 00 00 00 00  .h..............
      00000050: 00 00 3c 31 0f 24 01 00 00 00 3c 32 0f 88 01 00  ..<1.$....<2....
      00000060: 00 00 3c 33 0f d8 01 00 00 00 00 00 00 00 00 00  ..<3............
      00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      .....
      
      The highlighted bytes are the LSN that was replayed into the
      buffer: 0x100000538. This is cycle 1, block 0x538. Prior to replay,
      that block on disk looks like this:
      
      $ sudo xfs_db -c "fsb 0x417d" -c "type attr3" -c p /dev/mapper/thin-vol
      hdr.info.hdr.forw = 0
      hdr.info.hdr.back = 0
      hdr.info.hdr.magic = 0x3bee
      hdr.info.crc = 0xb5af0bc6 (correct)
      hdr.info.bno = 105448
      hdr.info.lsn = 0x100000900
                     ^^^^^^^^^^^
      hdr.info.uuid = df395e51-58ac-44b6-8dc5-e7104409bc17
      hdr.info.owner = 131203
      hdr.count = 2
      hdr.usedbytes = 120
      hdr.firstused = 3796
      hdr.holes = 1
      hdr.freemap[0-2] = [base,size]
      
      Note the LSN stamped into the buffer on disk: 1/0x900. The version
      on disk is much newer than the log transaction that was being
      replayed. That's a bug, and should -never- happen.
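
For readers who want to check the two LSNs by hand, the following Python sketch decodes them. It assumes the usual split of a 64-bit LSN into a cycle number (high 32 bits) and a block number (low 32 bits), which matches the "cycle 1, block 0x538" reading above. The helper names are made up for this example.

```python
import struct


def split_lsn(lsn):
    """Split a 64-bit LSN into (cycle, block): high 32 bits, low 32 bits."""
    return lsn >> 32, lsn & 0xFFFFFFFF


def format_lsn(lsn):
    """Render an LSN in the cycle/block form used in the text."""
    cycle, block = split_lsn(lsn)
    return f"{cycle}/{block:#x}"


# Bytes 0x18-0x1f of the corrupted buffer dump, stored big-endian on disk.
replayed_lsn = struct.unpack(">Q", bytes.fromhex("0000000100000538"))[0]

# hdr.info.lsn as reported by xfs_db before replay.
ondisk_lsn = 0x100000900

if __name__ == "__main__":
    print(format_lsn(replayed_lsn))   # 1/0x538
    print(format_lsn(ondisk_lsn))     # 1/0x900
    print(ondisk_lsn > replayed_lsn)  # True: the disk copy is newer than the log item
```

Because both LSNs are in cycle 1, comparing the block numbers is enough to see that the on-disk buffer (0x900) is newer than the replayed change (0x538).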
      
      So I immediately went to look at xlog_recover_get_buf_lsn() to check
      that we handled the LSN correctly. I was wondering if there was a
      similar "two commits with the same start LSN skips the second
      replay" problem with buffers. I didn't get that far, because I found
      a much more basic, rudimentary bug: xlog_recover_get_buf_lsn()
      doesn't recognise buffers with XFS_ATTR3_LEAF_MAGIC set in them!!!
      
      IOWs, attr3 leaf buffers fall through the magic number checks
      unrecognised, so trigger the "recover immediately" behaviour instead
      of undergoing an LSN check. IOWs, we incorrectly replay ATTR3 leaf
      buffers and that causes silent on disk corruption of inode attribute
      forks and potentially other things....
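
The fall-through can be sketched as a simplified Python model. This is an assumption-laden illustration, not the code of xlog_recover_get_buf_lsn(): the set of recognised magic numbers is reduced to a single other entry, the function names are invented for the example, and "no LSN found" is represented by None.

```python
XFS_DA3_NODE_MAGIC = 0x3EBE    # example of a magic number that was recognised
XFS_ATTR3_LEAF_MAGIC = 0x3BEE  # matches hdr.info.hdr.magic in the xfs_db output


def get_buf_lsn(magic, header_lsn, known_magics):
    """Return the LSN from the buffer header, or None if the magic is unknown."""
    return header_lsn if magic in known_magics else None


def should_replay(magic, header_lsn, log_lsn, known_magics):
    """Replay when no LSN can be found, or when the disk copy is older than the log item."""
    lsn = get_buf_lsn(magic, header_lsn, known_magics)
    if lsn is None:
        return True  # unrecognised buffer: "recover immediately"
    return lsn < log_lsn


if __name__ == "__main__":
    ondisk_lsn = 0x100000900  # LSN stamped in the attr3 leaf on disk
    log_lsn = 0x100000538     # LSN of the log item being replayed

    before_fix = {XFS_DA3_NODE_MAGIC}
    after_fix = before_fix | {XFS_ATTR3_LEAF_MAGIC}

    # Without the attr3 leaf magic, the newer disk buffer is overwritten.
    print(should_replay(XFS_ATTR3_LEAF_MAGIC, ondisk_lsn, log_lsn, before_fix))  # True
    # With it, the LSN check runs and the stale replay is skipped.
    print(should_replay(XFS_ATTR3_LEAF_MAGIC, ondisk_lsn, log_lsn, after_fix))   # False
```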
      
      Git history shows this is *another* zero day bug, this time
      introduced in commit 50d5c8d8 ("xfs: check LSN ordering for v5
      superblocks during recovery") which failed to handle the attr3 leaf
      buffers in recovery. And we've failed to handle them ever since...
      
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Darrick J. Wong <djwong@kernel.org>
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      d8f4c2d0