• David Howells's avatar
    netfs: Fix write oops in generic/346 (9p) and generic/074 (cifs) · df9b4556
    David Howells authored
    In netfslib, a buffered writeback operation has a 'write queue' of folios
    that are being written, held in a linear sequence of folio_queue structs.
    The 'issuer' adds new folio_queues on the leading edge of the queue and
    populates each one progressively; the 'collector' pops them off the
    trailing edge and discards them and the folios they point to as they are
    consumed.
    
    The queue is required to always retain at least one folio_queue structure.
    This allows the queue to be accessed without locking and with just a bit of
    barriering.
    
    When a new subrequest is prepared, its ->io_iter iterator is pointed at the
    current end of the write queue and then the iterator is extended as more
    data is added to the queue until the subrequest is committed.
    
    Now, the problem is that the folio_queue at the leading edge of the write
    queue when a subrequest is prepared might have been entirely consumed - but
    not yet removed from the queue as it is the only remaining one and is
    preventing the queue from collapsing.
    
    So, what happens is that subreq->io_iter is pointed at the spent
    folio_queue, then a new folio_queue is added, and, at that point, the
    collector is at entirely at liberty to immediately delete the spent
    folio_queue.
    
    This leaves the subreq->io_iter pointing at a freed object.  If the system
    is lucky, iterate_folioq() sees ->io_iter, sees the as-yet uncorrupted
    freed object and advances to the next folio_queue in the queue.
    
    In the case seen, however, the freed object gets recycled and put back onto
    the queue at the tail and filled to the end.  This confuses
    iterate_folioq() and it tries to step ->next, which may be NULL - resulting
    in an oops.
    
    Fix this by the following means:
    
     (1) When preparing a write subrequest, make sure there's a folio_queue
         struct with space in it at the leading edge of the queue.  A function
         to make space is split out of the function to append a folio so that
         it can be called for this purpose.
    
     (2) If the request struct iterator is pointing to a completely spent
         folio_queue when we make space, then advance the iterator to the newly
         allocated folio_queue.  The subrequest's iterator will then be set
         from this.
    
    The oops could be triggered using the generic/346 xfstest with a filesystem
    on9P over TCP with cache=loose.  The oops looked something like:
    
     BUG: kernel NULL pointer dereference, address: 0000000000000008
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     ...
     RIP: 0010:_copy_from_iter+0x2db/0x530
     ...
     Call Trace:
      <TASK>
     ...
      p9pdu_vwritef+0x3d8/0x5d0
      p9_client_prepare_req+0xa8/0x140
      p9_client_rpc+0x81/0x280
      p9_client_write+0xcf/0x1c0
      v9fs_issue_write+0x87/0xc0
      netfs_advance_write+0xa0/0xb0
      netfs_write_folio.isra.0+0x42d/0x500
      netfs_writepages+0x15a/0x1f0
      do_writepages+0xd1/0x220
      filemap_fdatawrite_wbc+0x5c/0x80
      v9fs_mmap_vm_close+0x7d/0xb0
      remove_vma+0x35/0x70
      vms_complete_munmap_vmas+0x11a/0x170
      do_vmi_align_munmap+0x17d/0x1c0
      do_vmi_munmap+0x13e/0x150
      __vm_munmap+0x92/0xd0
      __x64_sys_munmap+0x17/0x20
      do_syscall_64+0x80/0xe0
      entry_SYSCALL_64_after_hwframe+0x71/0x79
    
    This also fixed a similar-looking issue with cifs and generic/074.
    
    Fixes: cd0277ed ("netfs: Use new folio_queue data type and iterator instead of xarray iter")
    Reported-by: default avatarkernel test robot <oliver.sang@intel.com>
    Closes: https://lore.kernel.org/oe-lkp/202409180928.f20b5a08-oliver.sang@intel.com
    Closes: https://lore.kernel.org/oe-lkp/202409131438.3f225fbf-oliver.sang@intel.comSigned-off-by: default avatarDavid Howells <dhowells@redhat.com>
    Tested-by: default avatarkernel test robot <oliver.sang@intel.com>
    cc: Eric Van Hensbergen <ericvh@kernel.org>
    cc: Latchesar Ionkov <lucho@ionkov.net>
    cc: Dominique Martinet <asmadeus@codewreck.org>
    cc: Christian Schoenebeck <linux_oss@crudebyte.com>
    cc: Paulo Alcantara <pc@manguebit.com>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: v9fs@lists.linux.dev
    cc: linux-cifs@vger.kernel.org
    cc: netfs@lists.linux.dev
    cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
    df9b4556
write_issue.c 20 KB