1. 10 Jan, 2023 3 commits
    • nvme-pci: fix error handling in nvme_pci_enable() · 09113abf
      Tong Zhang authored
      There are two issues in nvme_pci_enable():
      
       1) If pci_alloc_irq_vectors() fails, the device is left enabled. Fix this
          by adding a goto disable statement.
       2) nvme_pci_configure_admin_queue() can return -ENODEV; in that case the
          IRQ must be freed properly. Otherwise the following warning can be
          triggered:
      
      [    5.286752] WARNING: CPU: 0 PID: 33 at kernel/irq/irqdomain.c:253 irq_domain_remove+0x12d/0x140
      [    5.290547] Call Trace:
      [    5.290626]  <TASK>
      [    5.290695]  msi_remove_device_irq_domain+0xc9/0xf0
      [    5.290843]  msi_device_data_release+0x15/0x80
      [    5.290978]  release_nodes+0x58/0x90
      [    5.293788] WARNING: CPU: 0 PID: 33 at kernel/irq/msi.c:276 msi_device_data_release+0x76/0x80
      [    5.297573] Call Trace:
      [    5.297651]  <TASK>
      [    5.297719]  release_nodes+0x58/0x90
      [    5.297831]  devres_release_all+0xef/0x140
      [    5.298339]  device_unbind_cleanup+0x11/0xc0
      [    5.298479]  really_probe+0x296/0x320
      
      Fixes: a6ee7f19 ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")
      Co-developed-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Tong Zhang <ztong0001@gmail.com>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
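      The two issues described in the nvme_pci_enable() commit follow the
      usual goto-based unwinding pattern. The following is a minimal
      user-space sketch of that pattern, not the driver code: struct
      fake_dev, fake_alloc_irqs(), fake_configure_admin_queue() and
      fake_enable() are hypothetical names, and the error codes are chosen
      only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state touched by the enable path. */
struct fake_dev {
	bool enabled;        /* set by the enable step, cleared on unwind */
	bool irqs_allocated; /* set once IRQ vectors have been allocated */
	int fail_at;         /* 0 = no failure, 1 = IRQ alloc, 2 = admin queue */
};

static int fake_alloc_irqs(struct fake_dev *d)
{
	if (d->fail_at == 1)
		return -EIO;
	d->irqs_allocated = true;
	return 0;
}

static int fake_configure_admin_queue(struct fake_dev *d)
{
	return d->fail_at == 2 ? -ENODEV : 0;
}

/* Each label undoes exactly the steps that succeeded before the failure. */
int fake_enable(struct fake_dev *d)
{
	int ret;

	d->enabled = true;
	d->irqs_allocated = false;

	ret = fake_alloc_irqs(d);
	if (ret)
		goto disable;   /* issue 1: do not leave the device enabled */

	ret = fake_configure_admin_queue(d);
	if (ret)
		goto free_irq;  /* issue 2: release the IRQs, then disable */

	return 0;

free_irq:
	d->irqs_allocated = false;
disable:
	d->enabled = false;
	/* On every error path nothing may remain allocated. */
	assert(!d->irqs_allocated);
	return ret;
}
```

      Calling fake_enable() with fail_at set to 1 or 2 returns the error
      and leaves both flags cleared; with fail_at set to 0 both stay set.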
    • nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllers · 453116a4
      Hector Martin authored
      This mirrors the quirk added to Apple Silicon controllers in apple.c.
      These controllers do not support the Active NS ID List command and
      behave identically to the SoC version judging by existing user
      reports/syslogs, so will need the same fix. This quirk reverts
      back to NVMe 1.0 behavior and disables the broken commands.
      
      Fixes: 811f4de0 ("nvme: avoid fallback to sequential scan due to transient issues")
      Signed-off-by: Hector Martin <marcan@marcan.st>
      Tested-by: Orlando Chamberlain <orlandoch.dev@gmail.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regression · aa96d6aa
      Hector Martin authored
      From the get-go, this driver and the ANS syslog have been complaining
      about namespace identification. In 6.2-rc1, commit 811f4de0 ("nvme:
      avoid fallback to sequential scan due to transient issues") regressed
      the driver by no longer allowing fallback to sequential namespace scans,
      leaving us with no namespaces.
      
      It turns out that the real problem is that this controller claiming
      NVMe 1.1 compat is treating the CNS field as a binary field, as in NVMe
      1.0. This already has a quirk, NVME_QUIRK_IDENTIFY_CNS, so set it for
      the controller to fix all this nonsense (including other errors
      triggered by other CNS commands).
      
      Fixes: 811f4de0 ("nvme: avoid fallback to sequential scan due to transient issues")
      Fixes: 5bd2927a ("nvme-apple: Add initial Apple SoC NVMe driver")
      Signed-off-by: Hector Martin <marcan@marcan.st>
      Reviewed-by: Sven Peter <sven@svenpeter.dev>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
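      The effect of the CNS quirk on namespace scanning can be sketched
      roughly as follows. This is an illustration under assumptions, not
      the kernel's implementation: the sketch_* names, the quirk bit value
      and the exact decision rule are hypothetical. It only models the
      idea that a controller claiming NVMe 1.1 is treated like an NVMe 1.0
      device once the quirk is set, so a sequential scan is used instead
      of the Active NS ID List command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Version encoding in the style of the NVMe VS register. */
#define SKETCH_VS(major, minor, tertiary) \
	(((uint32_t)(major) << 16) | ((uint32_t)(minor) << 8) | (uint32_t)(tertiary))

/* Hypothetical quirk bit for this sketch; not the kernel's actual value. */
#define SKETCH_QUIRK_IDENTIFY_CNS (1u << 0)

struct sketch_ctrl {
	uint32_t vs;     /* version the controller claims to implement */
	uint32_t quirks; /* quirk bits set by the driver */
};

enum sketch_scan {
	SKETCH_SCAN_NS_LIST,    /* use the Active NS ID List command */
	SKETCH_SCAN_SEQUENTIAL, /* probe namespace IDs one by one */
};

/* True when only the binary CNS values of NVMe 1.0 should be used. */
bool sketch_limited_cns(const struct sketch_ctrl *c)
{
	assert(c != NULL);
	if (c->quirks & SKETCH_QUIRK_IDENTIFY_CNS)
		return true;
	return c->vs < SKETCH_VS(1, 1, 0);
}

/* Pick the scan strategy from the CNS capability. */
enum sketch_scan sketch_pick_scan(const struct sketch_ctrl *c)
{
	return sketch_limited_cns(c) ? SKETCH_SCAN_SEQUENTIAL
				     : SKETCH_SCAN_NS_LIST;
}
```

      In this model a controller reporting version 1.1 with the quirk set
      ends up on the sequential path, the same as a 1.0 controller.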
  2. 09 Jan, 2023 1 commit
  3. 05 Jan, 2023 1 commit
  4. 04 Jan, 2023 6 commits
  5. 29 Dec, 2022 1 commit
    • Merge tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme into block-6.2 · 1551ed5a
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 6.2
      
       - fix various problems in handling the Command Supported and Effects log
         (Christoph Hellwig)
       - don't allow unprivileged passthrough of commands that don't transfer
         data but modify logical block content (Christoph Hellwig)
       - add a features and quirks policy document (Christoph Hellwig)
       - fix some really nasty code that was correct but made smatch complain
         (Sagi Grimberg)"
      
      * tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme:
        nvme-auth: fix smatch warning complaints
        nvme: consult the CSE log page for unprivileged passthrough
        nvme: also return I/O command effects from nvme_command_effects
        nvmet: don't defer passthrough commands with trivial effects to the workqueue
        nvmet: set the LBCC bit for commands that modify data
        nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it
        nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition
        docs, nvme: add a feature and quirk policy document
  6. 28 Dec, 2022 8 commits
  7. 26 Dec, 2022 3 commits
  8. 22 Dec, 2022 2 commits
    • Merge tag 'nvme-6.2-2022-12-22' of git://git.infradead.org/nvme into block-6.2 · fb857b0b
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 6.2
      
       - fix doorbell buffer value endianness (Klaus Jensen)
       - fix Linux vs NVMe page size mismatch (Keith Busch)
       - fix a potential memory access beyond the allocation limit
         (Keith Busch)
       - fix a multipath vs blktrace NULL pointer dereference
         (Yanjun Zhang)"
      
      * tag 'nvme-6.2-2022-12-22' of git://git.infradead.org/nvme:
        nvme: fix multipath crash caused by flush request when blktrace is enabled
        nvme-pci: fix page size checks
        nvme-pci: fix mempool alloc size
        nvme-pci: fix doorbell buffer value endianness
    • nvme: fix multipath crash caused by flush request when blktrace is enabled · 3659fb5a
      Yanjun Zhang authored
      The flush request initialized by blk_kick_flush() has a NULL bio, and
      it may be handled by nvme_end_req() during I/O completion. When
      blktrace is enabled and multipath is active,
      nvme_trace_bio_complete() tries to access the NULL bio pointer of the
      flush request, which results in the following crash:
      
      [ 2517.831677] BUG: kernel NULL pointer dereference, address: 000000000000001a
      [ 2517.835213] #PF: supervisor read access in kernel mode
      [ 2517.838724] #PF: error_code(0x0000) - not-present page
      [ 2517.842222] PGD 7b2d51067 P4D 0
      [ 2517.845684] Oops: 0000 [#1] SMP NOPTI
      [ 2517.849125] CPU: 2 PID: 732 Comm: kworker/2:1H Kdump: loaded Tainted: G S                5.15.67-0.cl9.x86_64 #1
      [ 2517.852723] Hardware name: XFUSION 2288H V6/BC13MBSBC, BIOS 1.13 07/27/2022
      [ 2517.856358] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp]
      [ 2517.859993] RIP: 0010:blk_add_trace_bio_complete+0x6/0x30
      [ 2517.863628] Code: 1f 44 00 00 48 8b 46 08 31 c9 ba 04 00 10 00 48 8b 80 50 03 00 00 48 8b 78 50 e9 e5 fe ff ff 0f 1f 44 00 00 41 54 49 89 f4 55 <0f> b6 7a 1a 48 89 d5 e8 3e 1c 2b 00 48 89 ee 4c 89 e7 5d 89 c1 ba
      [ 2517.871269] RSP: 0018:ff7f6a008d9dbcd0 EFLAGS: 00010286
      [ 2517.875081] RAX: ff3d5b4be00b1d50 RBX: 0000000002040002 RCX: ff3d5b0a270f2000
      [ 2517.878966] RDX: 0000000000000000 RSI: ff3d5b0b021fb9f8 RDI: 0000000000000000
      [ 2517.882849] RBP: ff3d5b0b96a6fa00 R08: 0000000000000001 R09: 0000000000000000
      [ 2517.886718] R10: 000000000000000c R11: 000000000000000c R12: ff3d5b0b021fb9f8
      [ 2517.890575] R13: 0000000002000000 R14: ff3d5b0b021fb1b0 R15: 0000000000000018
      [ 2517.894434] FS:  0000000000000000(0000) GS:ff3d5b42bfc80000(0000) knlGS:0000000000000000
      [ 2517.898299] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 2517.902157] CR2: 000000000000001a CR3: 00000004f023e005 CR4: 0000000000771ee0
      [ 2517.906053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 2517.909930] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 2517.913761] PKRU: 55555554
      [ 2517.917558] Call Trace:
      [ 2517.921294]  <TASK>
      [ 2517.924982]  nvme_complete_rq+0x1c3/0x1e0 [nvme_core]
      [ 2517.928715]  nvme_tcp_recv_pdu+0x4d7/0x540 [nvme_tcp]
      [ 2517.932442]  nvme_tcp_recv_skb+0x4f/0x240 [nvme_tcp]
      [ 2517.936137]  ? nvme_tcp_recv_pdu+0x540/0x540 [nvme_tcp]
      [ 2517.939830]  tcp_read_sock+0x9c/0x260
      [ 2517.943486]  nvme_tcp_try_recv+0x65/0xa0 [nvme_tcp]
      [ 2517.947173]  nvme_tcp_io_work+0x64/0x90 [nvme_tcp]
      [ 2517.950834]  process_one_work+0x1e8/0x390
      [ 2517.954473]  worker_thread+0x53/0x3c0
      [ 2517.958069]  ? process_one_work+0x390/0x390
      [ 2517.961655]  kthread+0x10c/0x130
      [ 2517.965211]  ? set_kthread_struct+0x40/0x40
      [ 2517.968760]  ret_from_fork+0x1f/0x30
      [ 2517.972285]  </TASK>
      
      To avoid this situation, add a NULL check for req->bio before
      calling trace_block_bio_complete.
      Signed-off-by: Yanjun Zhang <zhangyanjun@cestc.cn>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
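      The NULL check described for the flush request can be illustrated
      with a small user-space sketch. The sketch_* types and functions are
      hypothetical stand-ins, not the block layer's structures; the only
      point shown is that a request without a bio skips the tracing call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a bio and a request. */
struct sketch_bio {
	int traced; /* how many completion trace events were recorded */
};

struct sketch_request {
	struct sketch_bio *bio; /* NULL for flush requests in this sketch */
};

/* The trace hook dereferences the bio, so it must never see NULL. */
static void sketch_trace_bio_complete(struct sketch_bio *bio)
{
	assert(bio != NULL);
	bio->traced++;
}

/* Completion path: only trace requests that actually carry a bio. */
void sketch_end_req(struct sketch_request *req)
{
	if (req->bio)
		sketch_trace_bio_complete(req->bio);
}
```

      Completing a request whose bio is NULL is then a no-op for tracing,
      while a regular request is traced once.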
  9. 21 Dec, 2022 3 commits
  10. 16 Dec, 2022 1 commit
  11. 15 Dec, 2022 2 commits
  12. 14 Dec, 2022 7 commits
    • blk-iolatency: Fix memory leak on add_disk() failures · 813e6930
      Tejun Heo authored
      When a gendisk is successfully initialized but add_disk() fails, such as when
      a loop device has an invalid number of minor device numbers specified,
      blkcg_init_disk() is called during init and then blkcg_exit_disk() during
      error handling. Unfortunately, iolatency gets initialized in the former but
      doesn't get cleaned up in the latter.
      
      This is because, in non-error cases, the cleanup is performed by
      del_gendisk() calling rq_qos_exit(), the assumption being that rq_qos
      policies, iolatency being one of them, can only be activated once the disk
      is fully registered and visible. That assumption is true for wbt and iocost,
      but not so for iolatency as it gets initialized before add_disk() is called.
      
      It is desirable to lazy-init rq_qos policies because they are optional
      features and add to hot path overhead once initialized - each IO has to walk
      all the registered rq_qos policies. So, we want to switch iolatency to lazy
      init too. However, that's a bigger change. As a fix for the immediate
      problem, let's just add an extra call to rq_qos_exit() in blkcg_exit_disk().
      This is safe because duplicate calls to rq_qos_exit() become no-ops.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: darklight2357@icloud.com
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Fixes: d7067512 ("block: introduce blk-iolatency io controller")
      Cc: stable@vger.kernel.org # v4.19+
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/Y5TQ5gm3O4HXrXR3@slm.duckdns.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
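      The claim that a duplicate exit call is harmless can be pictured
      with a list that is emptied while it is torn down. The sketch below
      is an assumption-laden illustration, not the rq_qos code:
      struct sketch_policy, struct sketch_queue and sketch_qos_exit() are
      hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy node; "exited" counts calls to its exit hook. */
struct sketch_policy {
	struct sketch_policy *next;
	int exited;
};

struct sketch_queue {
	struct sketch_policy *policies; /* singly linked list of policies */
};

/*
 * Detach each policy before running its exit step. After the first call
 * the list is empty, so any later call finds nothing to do.
 */
void sketch_qos_exit(struct sketch_queue *q)
{
	assert(q != NULL);
	while (q->policies) {
		struct sketch_policy *p = q->policies;

		q->policies = p->next;
		p->exited++; /* stands in for the policy's exit callback */
	}
}
```

      Calling sketch_qos_exit() from both the normal teardown path and an
      error path therefore runs each policy's exit step exactly once.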
    • loop: Fix the max_loop commandline argument treatment when it is set to 0 · 85c50197
      Isaac J. Manjarres authored
      Currently, the max_loop commandline argument can be used to specify how
      many loop block devices are created at init time. If it is not
      specified on the commandline, CONFIG_BLK_DEV_LOOP_MIN_COUNT loop block
      devices will be created.
      
      The max_loop commandline argument can be used to override the value of
      CONFIG_BLK_DEV_LOOP_MIN_COUNT. However, when max_loop is set to 0
      through the commandline, the current logic treats it as if it had not
      been set, and creates CONFIG_BLK_DEV_LOOP_MIN_COUNT devices anyway.
      
      Fix this by starting max_loop off as set to CONFIG_BLK_DEV_LOOP_MIN_COUNT.
      This preserves the intended behavior of creating
      CONFIG_BLK_DEV_LOOP_MIN_COUNT loop block devices if the max_loop
      commandline parameter is not specified, and allowing max_loop to
      be respected for all values, including 0.
      
      This allows environments that can create all of their required loop
      block devices on demand to not have to unnecessarily preallocate loop
      block devices.
      
      Fixes: 73285082 ("remove artificial software max_loop limit")
      Cc: stable@vger.kernel.org
      Cc: Ken Chen <kenchen@google.com>
      Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
      Link: https://lore.kernel.org/r/20221208212902.765781-1-isaacmanjarres@google.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
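      The difference between the old and the new max_loop handling can be
      shown with two small functions. This is a simplified model, not the
      loop driver: SKETCH_MIN_COUNT is a hypothetical stand-in for
      CONFIG_BLK_DEV_LOOP_MIN_COUNT, and the have_arg flag models whether
      the parameter was given on the command line.

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_BLK_DEV_LOOP_MIN_COUNT. */
#define SKETCH_MIN_COUNT 8

/* Old behaviour: 0 doubles as "not set", so an explicit 0 is ignored. */
int sketch_old_count(int have_arg, int arg)
{
	int max_loop = 0;

	assert(arg >= 0);
	if (have_arg)
		max_loop = arg;
	return max_loop ? max_loop : SKETCH_MIN_COUNT;
}

/* New behaviour: start from the default, so every given value is kept. */
int sketch_new_count(int have_arg, int arg)
{
	int max_loop = SKETCH_MIN_COUNT;

	assert(arg >= 0);
	if (have_arg)
		max_loop = arg;
	return max_loop;
}
```

      With an explicit value of 0, the old model still creates
      SKETCH_MIN_COUNT devices while the new one creates none; without the
      argument both fall back to the default.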
    • block/blk-iocost (gcc13): keep large values in a new enum · ff1cc97b
      Jiri Slaby (SUSE) authored
      Since gcc-13, each member of an enum has the same type as the enum [1],
      and that type is in turn determined by its members. Provided:
        VTIME_PER_SEC_SHIFT     = 37,
        VTIME_PER_SEC           = 1LLU << VTIME_PER_SEC_SHIFT,
        ...
        AUTOP_CYCLE_NSEC        = 10LLU * NSEC_PER_SEC,
      the named type is unsigned long.
      
      This generates warnings with gcc-13:
        block/blk-iocost.c: In function 'ioc_weight_prfill':
        block/blk-iocost.c:3037:37: error: format '%u' expects argument of type 'unsigned int', but argument 4 has type 'long unsigned int'
      
        block/blk-iocost.c: In function 'ioc_weight_show':
        block/blk-iocost.c:3047:34: error: format '%u' expects argument of type 'unsigned int', but argument 3 has type 'long unsigned int'
      
      So split the anonymous enum with large values to a separate enum, so
      that they don't affect other members.
      
      [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36113
      
      Cc: Martin Liska <mliska@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: cgroups@vger.kernel.org
      Cc: linux-block@vger.kernel.org
      Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
      Link: https://lore.kernel.org/r/20221213120826.17446-1-jirislaby@kernel.org
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
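      The enum split can be sketched in isolation. The names below are
      hypothetical and only loosely modelled on the constants quoted in
      the commit message. Enumerators wider than int are a compiler
      extension before C23, so this assumes GCC or Clang in their default
      (non-pedantic) modes.

```c
#include <assert.h>

/* Small values get their own enum, so these constants stay int-sized. */
enum {
	SKETCH_SHIFT     = 37,
	SKETCH_MIN_VALID = 1,
};

/* Large values live in a separate enum and cannot widen the ones above. */
enum {
	SKETCH_PER_SEC = 1ULL << SKETCH_SHIFT,
};

/* Checked at compile time: the small constants still have int width. */
static_assert(sizeof(SKETCH_MIN_VALID) == sizeof(int),
	      "small enumerators should keep int width");

/* Return the large constant through a type wide enough to hold it. */
unsigned long long sketch_per_sec(void)
{
	return SKETCH_PER_SEC;
}
```

      Keeping the small constants int-sized means format strings written
      for int-width arguments continue to match them.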
    • block, bfq: replace 0/1 with false/true in bic apis · 337366e0
      Yu Kuai authored
      Just to make the code a little cleaner; there are no functional changes.
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20221214033155.3455754-3-yukuai1@huaweicloud.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block, bfq: don't return bfqg from __bfq_bic_change_cgroup() · 452af7dc
      Yu Kuai authored
      The return value is not used, hence remove it.
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20221214033155.3455754-2-yukuai1@huaweicloud.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block, bfq: fix possible uaf for 'bfqq->bic' · 64dc8c73
      Yu Kuai authored
      Our test reported a UAF for 'bfqq->bic' in 5.10:
      
      ==================================================================
      BUG: KASAN: use-after-free in bfq_select_queue+0x378/0xa30
      
      CPU: 6 PID: 2318352 Comm: fsstress Kdump: loaded Not tainted 5.10.0-60.18.0.50.h602.kasan.eulerosv2r11.x86_64 #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-20220320_160524-szxrtosci10000 04/01/2014
      Call Trace:
       bfq_select_queue+0x378/0xa30
       bfq_dispatch_request+0xe8/0x130
       blk_mq_do_dispatch_sched+0x62/0xb0
       __blk_mq_sched_dispatch_requests+0x215/0x2a0
       blk_mq_sched_dispatch_requests+0x8f/0xd0
       __blk_mq_run_hw_queue+0x98/0x180
       __blk_mq_delay_run_hw_queue+0x22b/0x240
       blk_mq_run_hw_queue+0xe3/0x190
       blk_mq_sched_insert_requests+0x107/0x200
       blk_mq_flush_plug_list+0x26e/0x3c0
       blk_finish_plug+0x63/0x90
       __iomap_dio_rw+0x7b5/0x910
       iomap_dio_rw+0x36/0x80
       ext4_dio_read_iter+0x146/0x190 [ext4]
       ext4_file_read_iter+0x1e2/0x230 [ext4]
       new_sync_read+0x29f/0x400
       vfs_read+0x24e/0x2d0
       ksys_read+0xd5/0x1b0
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x61/0xc6
      
      Since commit 3bc5e683 ("bfq: Split shared queues on move between
      cgroups"), moving a process to a new cgroup allocates a new bfqq for
      it to use. However, the old bfqq and the new bfqq can point to the
      same bic:

      1) Initial state, two processes with I/O in the same cgroup.
      
      Process 1       Process 2
       (BIC1)          (BIC2)
        |  Λ            |  Λ
        |  |            |  |
        V  |            V  |
        bfqq1           bfqq2
      
      2) bfqq1 is merged to bfqq2.
      
      Process 1       Process 2
       (BIC1)          (BIC2)
        |               |
         \-------------\|
                        V
        bfqq1           bfqq2(coop)
      
      3) Process 1 exits, then new I/O (denoted IOA) is issued from Process 2.
      
       (BIC2)
        |  Λ
        |  |
        V  |
        bfqq2(coop)
      
      4) Before IOA is completed, move Process 2 to another cgroup and issue io.
      
      Process 2
       (BIC2)
         Λ
         |\--------------\
         |                V
        bfqq2           bfqq3
      
      Now BIC2 points to bfqq3, while bfqq2 and bfqq3 both point to BIC2.
      If all the requests are completed and Process 2 exits, BIC2 will be
      freed, while there is no guarantee that bfqq2 will be freed before BIC2.

      Fix the problem by clearing bfqq->bic when bfqq is detached from bic.
      
      Fixes: 3bc5e683 ("bfq: Split shared queues on move between cgroups")
      Suggested-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20221214030430.3304151-1-yukuai1@huaweicloud.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
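      The fix for the dangling back-pointer can be modelled with two
      structures that point at each other. The following is a user-space
      sketch with hypothetical names (struct sketch_bic,
      struct sketch_bfqq, sketch_bic_set_bfqq()), not the BFQ code; it
      only shows the idea of clearing the old queue's pointer when the
      queue is detached.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_bic;

/* Hypothetical queue holding a back-pointer to its I/O context. */
struct sketch_bfqq {
	struct sketch_bic *bic;
};

/* Hypothetical I/O context pointing at its current queue. */
struct sketch_bic {
	struct sketch_bfqq *bfqq;
};

/* Attach new_q to bic; the queue it replaces drops its back-pointer. */
void sketch_bic_set_bfqq(struct sketch_bic *bic, struct sketch_bfqq *new_q)
{
	struct sketch_bfqq *old;

	assert(bic != NULL);
	old = bic->bfqq;

	/* Clear the stale back-pointer so old cannot reach bic once freed. */
	if (old && old->bic == bic)
		old->bic = NULL;

	bic->bfqq = new_q;
	if (new_q)
		new_q->bic = bic;
}
```

      After switching the context from one queue to another, only the
      new queue still refers to the context, so freeing the context
      cannot leave the old queue with a dangling pointer.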
    • Merge tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · e2ca6ba6
      Linus Torvalds authored
      Pull MM updates from Andrew Morton:
      
       - More userfaultfd work from Peter Xu
      
       - Several convert-to-folios series from Sidhartha Kumar and Huang Ying
      
       - Some filemap cleanups from Vishal Moola
      
       - David Hildenbrand added the ability to selftest anon memory COW
         handling
      
       - Some cpuset simplifications from Liu Shixin
      
       - Addition of vmalloc tracing support by Uladzislau Rezki
      
       - Some pagecache folioifications and simplifications from Matthew
         Wilcox
      
       - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use
         it
      
       - Miguel Ojeda contributed some cleanups for our use of the
         __no_sanitize_thread__ gcc keyword.
      
         This series should have been in the non-MM tree, my bad
      
       - Naoya Horiguchi improved the interaction between memory poisoning and
         memory section removal for huge pages
      
       - DAMON cleanups and tuneups from SeongJae Park
      
       - Tony Luck fixed the handling of COW faults against poisoned pages
      
       - Peter Xu utilized the PTE marker code for handling swapin errors
      
       - Hugh Dickins reworked compound page mapcount handling, simplifying it
         and making it more efficient
      
       - Removal of the autonuma savedwrite infrastructure from Nadav Amit and
         David Hildenbrand
      
       - zram support for multiple compression streams from Sergey Senozhatsky
      
       - David Hildenbrand reworked the GUP code's R/O long-term pinning so
         that drivers no longer need to use the FOLL_FORCE workaround which
         didn't work very well anyway
      
       - Mel Gorman altered the page allocator so that local IRQs can remain
         enabled during per-cpu page allocations
      
       - Vishal Moola removed the try_to_release_page() wrapper
      
       - Stefan Roesch added some per-BDI sysfs tunables which are used to
         prevent network block devices from dirtying excessive amounts of
         pagecache
      
       - David Hildenbrand did some cleanup and repair work on KSM COW
         breaking
      
       - Nhat Pham and Johannes Weiner have implemented writeback in zswap's
         zsmalloc backend
      
       - Brian Foster has fixed a longstanding corner-case oddity in
         file[map]_write_and_wait_range()
      
       - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
         Chen
      
       - Shiyang Ruan has done some work on fsdax, to make its reflink mode
         work better under xfstests. Better, but still not perfect
      
       - Christoph Hellwig has removed the .writepage() method from several
         filesystems. They only need .writepages()
      
       - Yosry Ahmed wrote a series which fixes the memcg reclaim target
         beancounting
      
       - David Hildenbrand has fixed some of our MM selftests for 32-bit
         machines
      
       - Many singleton patches, as usual
      
      * tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits)
        mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio
        mm: mmu_gather: allow more than one batch of delayed rmaps
        mm: fix typo in struct pglist_data code comment
        kmsan: fix memcpy tests
        mm: add cond_resched() in swapin_walk_pmd_entry()
        mm: do not show fs mm pc for VM_LOCKONFAULT pages
        selftests/vm: ksm_functional_tests: fixes for 32bit
        selftests/vm: cow: fix compile warning on 32bit
        selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions
        mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem
        mm,thp,rmap: fix races between updates of subpages_mapcount
        mm: memcg: fix swapcached stat accounting
        mm: add nodes= arg to memory.reclaim
        mm: disable top-tier fallback to reclaim on proactive reclaim
        selftests: cgroup: make sure reclaim target memcg is unprotected
        selftests: cgroup: refactor proactive reclaim code to reclaim_until()
        mm: memcg: fix stale protection of reclaim target memcg
        mm/mmap: properly unaccount memory on mas_preallocate() failure
        omfs: remove ->writepage
        jfs: remove ->writepage
        ...
  13. 13 Dec, 2022 2 commits
    • Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 7e68dd7d
      Linus Torvalds authored
      Pull networking updates from Paolo Abeni:
       "Core:
      
         - Allow live renaming when an interface is up
      
         - Add retpoline wrappers for tc, considerably improving the
           performance of complex queue discipline configurations
      
         - Add inet drop monitor support
      
         - A few GRO performance improvements
      
         - Add infrastructure for atomic dev stats, addressing long standing
           data races
      
         - De-duplicate common code between OVS and conntrack offloading
           infrastructure
      
         - A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements
      
         - Netfilter: introduce packet parser for tunneled packets
      
         - Replace IPVS timer-based estimators with kthreads to scale up the
           workload with the number of available CPUs
      
         - Add the helper support for connection-tracking OVS offload
      
        BPF:
      
         - Support for user defined BPF objects: the use case is to allocate
           own objects, build own object hierarchies and use the building
           blocks to build own data structures flexibly, for example, linked
           lists in BPF
      
         - Make cgroup local storage available to non-cgroup attached BPF
           programs
      
         - Avoid unnecessary deadlock detection and failures wrt BPF task
           storage helpers
      
         - A relevant bunch of BPF verifier fixes and improvements
      
         - Veristat tool improvements to support custom filtering, sorting,
           and replay of results
      
         - Add LLVM disassembler as default library for dumping JITed code
      
         - Lots of new BPF documentation for various BPF maps
      
         - Add bpf_rcu_read_{,un}lock() support for sleepable programs
      
         - Add RCU grace period chaining to BPF to wait for the completion of
           access from both sleepable and non-sleepable BPF programs
      
         - Add support for storing struct task_struct objects as kptrs in maps
      
         - Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
           values
      
         - Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions
      
        Protocols:
      
         - TCP: implement Protective Load Balancing across switch links
      
         - TCP: allow dynamically disabling TCP-MD5 static key, reverting back
           to fast[er]-path
      
         - UDP: Introduce optional per-netns hash lookup table
      
         - IPv6: simplify and cleanup sockets disposal
      
         - Netlink: support different type policies for each generic netlink
           operation
      
         - MPTCP: add MSG_FASTOPEN and FastOpen listener side support
      
         - MPTCP: add netlink notification support for listener sockets events
      
         - SCTP: add VRF support, allowing sctp sockets binding to VRF devices
      
         - Add bridging MAC Authentication Bypass (MAB) support
      
         - Extensions for Ethernet VPN bridging implementation to better
           support multicast scenarios
      
         - More work for Wi-Fi 7 support, comprising conversion of all the
           existing drivers to internal TX queue usage
      
         - IPSec: introduce a new offload type (packet offload) allowing
           complete header processing and crypto offloading
      
         - IPSec: extended ack support for more descriptive XFRM error
           reporting
      
         - RXRPC: increase SACK table size and move processing into a
           per-local endpoint kernel thread, reducing considerably the
           required locking
      
         - IEEE 802154: synchronous send frame and extended filtering support,
           initial support for scanning available 15.4 networks
      
         - Tun: bump the link speed from 10Mbps to 10Gbps
      
         - Tun/VirtioNet: implement UDP segmentation offload support
      
        Driver API:
      
         - PHY/SFP: improve power level switching between standard level 1 and
           the higher power levels
      
         - New API for netdev <-> devlink_port linkage
      
         - PTP: convert existing drivers to new frequency adjustment
           implementation
      
         - DSA: add support for rx offloading
      
         - Autoload DSA tagging driver when dynamically changing protocol
      
         - Add new PCP and APPTRUST attributes to Data Center Bridging
      
         - Add configuration support for 800Gbps link speed
      
         - Add devlink port function attribute to enable/disable RoCE and
           migratable
      
         - Extend devlink-rate to support strict priority and weighted fair
           queuing
      
         - Add devlink support to directly reading from region memory
      
         - New device tree helper to fetch MAC address from nvmem
      
         - New big TCP helper to simplify temporary header stripping
      
        New hardware / drivers:
      
         - Ethernet:
            - Marvell Octeon CNF95N and CN10KB Ethernet Switches
            - Marvell Prestera AC5X Ethernet Switch
            - WangXun 10 Gigabit NIC
            - Motorcomm yt8521 Gigabit Ethernet
            - Microchip ksz9563 Gigabit Ethernet Switch
            - Microsoft Azure Network Adapter
            - Linux Automation 10Base-T1L adapter
      
         - PHY:
            - Aquantia AQR112 and AQR412
            - Motorcomm YT8531S
      
         - PTP:
            - Orolia ART-CARD
      
         - WiFi:
            - MediaTek Wi-Fi 7 (802.11be) devices
            - RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
              devices
      
         - Bluetooth:
            - Broadcom BCM4377/4378/4387 Bluetooth chipsets
            - Realtek RTL8852BE and RTL8723DS
            - Cypress CYW4373A0 WiFi + Bluetooth combo device
      
        Drivers:
      
         - CAN:
            - gs_usb: bus error reporting support
            - kvaser_usb: listen only and bus error reporting support
      
         - Ethernet NICs:
            - Intel (100G):
               - extend action skbedit to RX queue mapping
               - implement devlink-rate support
               - support direct read from memory
            - nVidia/Mellanox (mlx5):
               - SW steering improvements, increasing rules update rate
               - Support for enhanced events compression
               - extend H/W offload packet manipulation capabilities
               - implement IPSec packet offload mode
            - nVidia/Mellanox (mlx4):
               - better big TCP support
            - Netronome Ethernet NICs (nfp):
               - IPsec offload support
               - add support for multicast filter
            - Broadcom:
               - RSS and PTP support improvements
            - AMD/SolarFlare:
               - netlink extended ack improvements
               - add basic flower matches to offload, and related stats
            - Virtual NICs:
               - ibmvnic: introduce affinity hint support
            - small / embedded:
               - FreeScale fec: add initial XDP support
               - Marvell mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood
               - TI am65-cpsw: add suspend/resume support
               - Mediatek MT7986: add RX Wireless Ethernet Dispatch support
               - Realtek 8169: enable GRO software interrupt coalescing per
                 default
      
         - Ethernet high-speed switches:
            - Microchip (sparx5):
               - add support for Sparx5 TC/flower H/W offload via VCAP
            - Mellanox mlxsw:
               - add 802.1X and MAC Authentication Bypass offload support
               - add ip6gre support
      
         - Embedded Ethernet switches:
            - Mediatek (mtk_eth_soc):
               - improve PCS implementation, add DSA untag support
               - enable flow offload support
            - Renesas:
               - add rswitch R-Car Gen4 gPTP support
            - Microchip (lan966x):
               - add full XDP support
               - add TC H/W offload via VCAP
               - enable PTP on bridge interfaces
            - Microchip (ksz8):
               - add MTU support for KSZ8 series
      
         - Qualcomm 802.11ax WiFi (ath11k):
            - support configuring channel dwell time during scan
      
         - MediaTek WiFi (mt76):
            - enable Wireless Ethernet Dispatch (WED) offload support
            - add ack signal support
            - enable coredump support
            - remain_on_channel support
      
         - Intel WiFi (iwlwifi):
            - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities
            - 320 MHz channels support
      
         - RealTek WiFi (rtw89):
            - new dynamic header firmware format support
            - wake-over-WLAN support"
      
      * tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits)
        ipvs: fix type warning in do_div() on 32 bit
        net: lan966x: Remove a useless test in lan966x_ptp_add_trap()
        net: ipa: add IPA v4.7 support
        dt-bindings: net: qcom,ipa: Add SM6350 compatible
        bnxt: Use generic HBH removal helper in tx path
        IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
        selftests: forwarding: Add bridge MDB test
        selftests: forwarding: Rename bridge_mdb test
        bridge: mcast: Support replacement of MDB port group entries
        bridge: mcast: Allow user space to specify MDB entry routing protocol
        bridge: mcast: Allow user space to add (*, G) with a source list and filter mode
        bridge: mcast: Add support for (*, G) with a source list and filter mode
        bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source
        bridge: mcast: Add a flag for user installed source entries
        bridge: mcast: Expose __br_multicast_del_group_src()
        bridge: mcast: Expose br_multicast_new_group_src()
        bridge: mcast: Add a centralized error path
        bridge: mcast: Place netlink policy before validation functions
        bridge: mcast: Split (*, G) and (S, G) addition into different functions
        bridge: mcast: Do not derive entry type from its filter mode
        ...
    • Merge tag 'xtensa-20221213' of https://github.com/jcmvbkbc/linux-xtensa · 1ca06f1c
      Linus Torvalds authored
      Pull Xtensa updates from Max Filippov:
      
       - fix kernel build with gcc-13
      
       - various minor fixes
      
      * tag 'xtensa-20221213' of https://github.com/jcmvbkbc/linux-xtensa:
        xtensa: add __umulsidi3 helper
        xtensa: update config files
        MAINTAINERS: update the 'T:' entry for xtensa