1. 15 Feb, 2018 16 commits
    • Leon Romanovsky's avatar
      RDMA/uverbs: Fix bad unlock balance in ib_uverbs_close_xrcd · 5c2e1c4f
      Leon Romanovsky authored
      There is no matching lock for this mutex. Git history suggests this is
      just a missed remnant from an earlier version of the function before
      this locking was moved into uverbs_free_xrcd.
      
      Originally this lock was protecting the xrcd_table_delete()
      
      =====================================
      WARNING: bad unlock balance detected!
      4.15.0+ #87 Not tainted
      -------------------------------------
      syzkaller223405/269 is trying to release lock (&uverbs_dev->xrcd_tree_mutex) at:
      [<00000000b8703372>] ib_uverbs_close_xrcd+0x195/0x1f0
      but there are no more locks to release!
      
      other info that might help us debug this:
      1 lock held by syzkaller223405/269:
       #0:  (&uverbs_dev->disassociate_srcu){....}, at: [<000000005af3b960>] ib_uverbs_write+0x265/0xef0
      
      stack backtrace:
      CPU: 0 PID: 269 Comm: syzkaller223405 Not tainted 4.15.0+ #87
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      Call Trace:
       dump_stack+0xde/0x164
       ? dma_virt_map_sg+0x22c/0x22c
       ? ib_uverbs_write+0x265/0xef0
       ? console_unlock+0x502/0xbd0
       ? ib_uverbs_close_xrcd+0x195/0x1f0
       print_unlock_imbalance_bug+0x131/0x160
       lock_release+0x59d/0x1100
       ? ib_uverbs_close_xrcd+0x195/0x1f0
       ? lock_acquire+0x440/0x440
       ? lock_acquire+0x440/0x440
       __mutex_unlock_slowpath+0x88/0x670
       ? wait_for_completion+0x4c0/0x4c0
       ? rdma_lookup_get_uobject+0x145/0x2f0
       ib_uverbs_close_xrcd+0x195/0x1f0
       ? ib_uverbs_open_xrcd+0xdd0/0xdd0
       ib_uverbs_write+0x7f9/0xef0
       ? cyc2ns_read_end+0x10/0x10
       ? ib_uverbs_open_xrcd+0xdd0/0xdd0
       ? uverbs_devnode+0x110/0x110
       ? cyc2ns_read_end+0x10/0x10
       ? cyc2ns_read_end+0x10/0x10
       ? sched_clock_cpu+0x18/0x200
       __vfs_write+0x10d/0x700
       ? uverbs_devnode+0x110/0x110
       ? kernel_read+0x170/0x170
       ? __fget+0x358/0x5d0
       ? security_file_permission+0x93/0x260
       vfs_write+0x1b0/0x550
       SyS_write+0xc7/0x1a0
       ? SyS_read+0x1a0/0x1a0
       ? trace_hardirqs_on_thunk+0x1a/0x1c
       entry_SYSCALL_64_fastpath+0x1e/0x8b
      RIP: 0033:0x4335c9
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: <stable@vger.kernel.org> # 4.11
      Fixes: fd3c7904 ("IB/core: Change idr objects to use the new schema")
      Reported-by: default avatarNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      5c2e1c4f
    • Leon Romanovsky's avatar
      RDMA/restrack: Increment CQ restrack object before committing · 0cba0efc
      Leon Romanovsky authored
      Once the uobj is committed it is immediately possible another thread
      could destroy it, which worst case, can result in a use-after-free
      of the restrack objects.
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Fixes: 08f294a1 ("RDMA/core: Add resource tracking for create and destroy CQs")
      Reported-by: default avatarNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      0cba0efc
    • Leon Romanovsky's avatar
      RDMA/uverbs: Protect from command mask overflow · 3f802b16
      Leon Romanovsky authored
      The command number is not bounds checked against the command mask before it
      is shifted, resulting in an ubsan hit. This does not cause malfunction since
      the command number is eventually bounds checked, but we can make this ubsan
      clean by moving the bounds check to before the mask check.
      
      ================================================================================
      UBSAN: Undefined behaviour in
      drivers/infiniband/core/uverbs_main.c:647:21
      shift exponent 207 is too large for 64-bit type 'long long unsigned int'
      CPU: 0 PID: 446 Comm: syz-executor3 Not tainted 4.15.0-rc2+ #61
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      Call Trace:
      dump_stack+0xde/0x164
      ? dma_virt_map_sg+0x22c/0x22c
      ubsan_epilogue+0xe/0x81
      __ubsan_handle_shift_out_of_bounds+0x293/0x2f7
      ? debug_check_no_locks_freed+0x340/0x340
      ? __ubsan_handle_load_invalid_value+0x19b/0x19b
      ? lock_acquire+0x440/0x440
      ? lock_acquire+0x19d/0x440
      ? __might_fault+0xf4/0x240
      ? ib_uverbs_write+0x68d/0xe20
      ib_uverbs_write+0x68d/0xe20
      ? __lock_acquire+0xcf7/0x3940
      ? uverbs_devnode+0x110/0x110
      ? cyc2ns_read_end+0x10/0x10
      ? sched_clock_cpu+0x18/0x200
      ? sched_clock_cpu+0x18/0x200
      __vfs_write+0x10d/0x700
      ? uverbs_devnode+0x110/0x110
      ? kernel_read+0x170/0x170
      ? __fget+0x35b/0x5d0
      ? security_file_permission+0x93/0x260
      vfs_write+0x1b0/0x550
      SyS_write+0xc7/0x1a0
      ? SyS_read+0x1a0/0x1a0
      ? trace_hardirqs_on_thunk+0x1a/0x1c
      entry_SYSCALL_64_fastpath+0x18/0x85
      RIP: 0033:0x448e29
      RSP: 002b:00007f033f567c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 00007f033f5686bc RCX: 0000000000448e29
      RDX: 0000000000000060 RSI: 0000000020001000 RDI: 0000000000000012
      RBP: 000000000070bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000056a0 R14: 00000000006e8740 R15: 0000000000000000
      ================================================================================
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: <stable@vger.kernel.org> # 4.5
      Fixes: 2dbd5186 ("IB/core: IB/core: Allow legacy verbs through extended interfaces")
      Reported-by: default avatarNoa Osherovich <noaos@mellanox.com>
      Reviewed-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      3f802b16
    • Jason Gunthorpe's avatar
      IB/uverbs: Fix unbalanced unlock on error path for rdma_explicit_destroy · ec6f8401
      Jason Gunthorpe authored
      If remove_commit fails then the lock is left locked while the uobj still
      exists. Eventually the kernel will deadlock.
      
      lockdep detects this and says:
      
       test/4221 is leaving the kernel with locks still held!
       1 lock held by test/4221:
        #0:  (&ucontext->cleanup_rwsem){.+.+}, at: [<000000001e5c7523>] rdma_explicit_destroy+0x37/0x120 [ib_uverbs]
      
      Fixes: 4da70da2 ("IB/core: Explicitly destroy an object while keeping uobject")
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Reviewed-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      ec6f8401
    • Jason Gunthorpe's avatar
      IB/uverbs: Improve lockdep_check · 104f268d
      Jason Gunthorpe authored
      This is really being used as an assert that the expected usecnt
      is being held and implicitly that the usecnt is valid. Rename it to
      assert_uverbs_usecnt and tighten the checks to only accept valid
      values of usecnt (eg 0 and < -1 are invalid).
      
      The tigher checkes make the assertion cover more cases and is more
      likely to find bugs via syzkaller/etc.
      
      Fixes: 38321256 ("IB/core: Add support for idr types")
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      104f268d
    • Leon Romanovsky's avatar
      RDMA/uverbs: Protect from races between lookup and destroy of uobjects · 6623e3e3
      Leon Romanovsky authored
      The race is between lookup_get_idr_uobject and
      uverbs_idr_remove_uobj -> uverbs_uobject_put.
      
      We deliberately do not call sychronize_rcu after the idr_remove in
      uverbs_idr_remove_uobj for performance reasons, instead we call
      kfree_rcu() during uverbs_uobject_put.
      
      However, this means we can obtain pointers to uobj's that have
      already been released and must protect against krefing them
      using kref_get_unless_zero.
      
      ==================================================================
      BUG: KASAN: use-after-free in copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
      Read of size 4 at addr ffff88005fda1ac8 by task syz-executor2/441
      
      CPU: 1 PID: 441 Comm: syz-executor2 Not tainted 4.15.0-rc2+ #56
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      Call Trace:
      dump_stack+0x8d/0xd4
      print_address_description+0x73/0x290
      kasan_report+0x25c/0x370
      ? copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
      copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
      ? uverbs_try_lock_object+0x68/0xc0
      ? modify_qp.isra.7+0xdc4/0x10e0
      modify_qp.isra.7+0xdc4/0x10e0
      ib_uverbs_modify_qp+0xfe/0x170
      ? ib_uverbs_query_qp+0x970/0x970
      ? __lock_acquire+0xa11/0x1da0
      ib_uverbs_write+0x55a/0xad0
      ? ib_uverbs_query_qp+0x970/0x970
      ? ib_uverbs_query_qp+0x970/0x970
      ? ib_uverbs_open+0x760/0x760
      ? futex_wake+0x147/0x410
      ? sched_clock_cpu+0x18/0x180
      ? check_prev_add+0x1680/0x1680
      ? do_futex+0x3b6/0xa30
      ? sched_clock_cpu+0x18/0x180
      __vfs_write+0xf7/0x5c0
      ? ib_uverbs_open+0x760/0x760
      ? kernel_read+0x110/0x110
      ? lock_acquire+0x370/0x370
      ? __fget+0x264/0x3b0
      vfs_write+0x18a/0x460
      SyS_write+0xc7/0x1a0
      ? SyS_read+0x1a0/0x1a0
      ? trace_hardirqs_on_thunk+0x1a/0x1c
      entry_SYSCALL_64_fastpath+0x18/0x85
      RIP: 0033:0x448e29
      RSP: 002b:00007f443fee0c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 00007f443fee16bc RCX: 0000000000448e29
      RDX: 0000000000000078 RSI: 00000000209f8000 RDI: 0000000000000012
      RBP: 000000000070bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000008e98 R14: 00000000006ebf38 R15: 0000000000000000
      
      Allocated by task 1:
      kmem_cache_alloc_trace+0x16c/0x2f0
      mlx5_alloc_cmd_msg+0x12e/0x670
      cmd_exec+0x419/0x1810
      mlx5_cmd_exec+0x40/0x70
      mlx5_core_mad_ifc+0x187/0x220
      mlx5_MAD_IFC+0xd7/0x1b0
      mlx5_query_mad_ifc_gids+0x1f3/0x650
      mlx5_ib_query_gid+0xa4/0xc0
      ib_query_gid+0x152/0x1a0
      ib_query_port+0x21e/0x290
      mlx5_port_immutable+0x30f/0x490
      ib_register_device+0x5dd/0x1130
      mlx5_ib_add+0x3e7/0x700
      mlx5_add_device+0x124/0x510
      mlx5_register_interface+0x11f/0x1c0
      mlx5_ib_init+0x56/0x61
      do_one_initcall+0xa3/0x250
      kernel_init_freeable+0x309/0x3b8
      kernel_init+0x14/0x180
      ret_from_fork+0x24/0x30
      
      Freed by task 1:
      kfree+0xeb/0x2f0
      mlx5_free_cmd_msg+0xcd/0x140
      cmd_exec+0xeba/0x1810
      mlx5_cmd_exec+0x40/0x70
      mlx5_core_mad_ifc+0x187/0x220
      mlx5_MAD_IFC+0xd7/0x1b0
      mlx5_query_mad_ifc_gids+0x1f3/0x650
      mlx5_ib_query_gid+0xa4/0xc0
      ib_query_gid+0x152/0x1a0
      ib_query_port+0x21e/0x290
      mlx5_port_immutable+0x30f/0x490
      ib_register_device+0x5dd/0x1130
      mlx5_ib_add+0x3e7/0x700
      mlx5_add_device+0x124/0x510
      mlx5_register_interface+0x11f/0x1c0
      mlx5_ib_init+0x56/0x61
      do_one_initcall+0xa3/0x250
      kernel_init_freeable+0x309/0x3b8
      kernel_init+0x14/0x180
      ret_from_fork+0x24/0x30
      
      The buggy address belongs to the object at ffff88005fda1ab0
      which belongs to the cache kmalloc-32 of size 32
      The buggy address is located 24 bytes inside of
      32-byte region [ffff88005fda1ab0, ffff88005fda1ad0)
      The buggy address belongs to the page:
      page:00000000d5655c19 count:1 mapcount:0 mapping: (null)
      index:0xffff88005fda1fc0
      flags: 0x4000000000000100(slab)
      raw: 4000000000000100 0000000000000000 ffff88005fda1fc0 0000000180550008
      raw: ffffea00017f6780 0000000400000004 ffff88006c803980 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
      ffff88005fda1980: fc fc fb fb fb fb fc fc fb fb fb fb fc fc fb fb
      ffff88005fda1a00: fb fb fc fc fb fb fb fb fc fc 00 00 00 00 fc fc
      ffff88005fda1a80: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
      ffff88005fda1b00: fc fc 00 00 00 00 fc fc fb fb fb fb fc fc fb fb
      ffff88005fda1b80: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
      ==================================================================@
      
      Cc: syzkaller <syzkaller@googlegroups.com>
      Cc: <stable@vger.kernel.org> # 4.11
      Fixes: 38321256 ("IB/core: Add support for idr types")
      Reported-by: default avatarNoa Osherovich <noaos@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      6623e3e3
    • Jason Gunthorpe's avatar
      IB/uverbs: Hold the uobj write lock after allocate · d9dc7a35
      Jason Gunthorpe authored
      This clarifies the design intention that time between allocate and
      commit has the uobj exclusive to the caller. We already guarantee
      this by delaying publishing the uobj pointer via idr_insert,
      fd_install, list_add, etc.
      
      Additionally holding the usecnt lock during this period provides
      extra clarity and more protection against future mistakes.
      
      Fixes: 38321256 ("IB/core: Add support for idr types")
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      d9dc7a35
    • Matan Barak's avatar
      IB/uverbs: Fix possible oops with duplicate ioctl attributes · 4d39a959
      Matan Barak authored
      If the same attribute is listed twice by the user in the ioctl attribute
      list then error unwind can cause the kernel to deref garbage.
      
      This happens when an object with WRITE access is sent twice. The second
      parse properly fails but corrupts the state required for the error unwind
      it triggers.
      
      Fixing this by making duplicates in the attribute list invalid. This is
      not something we need to support.
      
      The ioctl interface is currently recommended to be disabled in kConfig.
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      4d39a959
    • Matan Barak's avatar
      IB/uverbs: Add ioctl support for 32bit processes · 9dfb2ff4
      Matan Barak authored
      32 bit processes running on a 64 bit kernel call compat_ioctl so that
      implementations can revise any structure layout issues. Point compat_ioctl
      at our normal ioctl because:
      
      - All our structures are designed to be the same on 32 and 64 bit, ie we
        use __aligned_u64 when required and are careful to manage padding.
      
      - Any pointers are stored in u64's and userspace is expected
        to prepare them properly.
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      9dfb2ff4
    • Jason Gunthorpe's avatar
      IB/uverbs: Use __aligned_u64 for uapi headers · 5d2beb57
      Jason Gunthorpe authored
      This has no impact on the structure layout since these structs already
      have their u64s already properly aligned, but it does document that we
      have this requirement for 32 bit compatibility.
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Reviewed-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      5d2beb57
    • Matan Barak's avatar
      IB/uverbs: Fix method merging in uverbs_ioctl_merge · 3d89459e
      Matan Barak authored
      Fix a bug in uverbs_ioctl_merge that looked at the object's iterator
      number instead of the method's iterator number when merging methods.
      
      While we're at it, make the uverbs_ioctl_merge code a bit more clear
      and faster.
      
      Fixes: 118620d3 ('IB/core: Add uverbs merge trees functionality')
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      3d89459e
    • Jason Gunthorpe's avatar
      IB/uverbs: Use u64_to_user_ptr() not a union · 2f36028c
      Jason Gunthorpe authored
      The union approach will get the endianness wrong sometimes if the kernel's
      pointer size is 32 bits resulting in EFAULTs when trying to copy to/from
      user.
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Reviewed-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      2f36028c
    • Jason Gunthorpe's avatar
      IB/uverbs: Use inline data transfer for UHW_IN · 6c976c30
      Jason Gunthorpe authored
      The rule for the API is pointers less than 8 bytes are inlined into
      the .data field of the attribute. Fix the creation of the driver udata
      struct to follow this rule and point to the .data itself when the size
      is less than 8 bytes.
      
      Otherwise if the UHW struct is less than 8 bytes the driver will get
      EFAULT during copy_from_user.
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      6c976c30
    • Matan Barak's avatar
      IB/uverbs: Always use the attribute size provided by the user · 89d9e8d3
      Matan Barak authored
      This fixes several bugs around the copy_to/from user path:
       - copy_to used the user provided size of the attribute
         and could copy data beyond the end of the kernel buffer into
         userspace.
       - copy_from didn't know the size of the kernel buffer and
         could have left kernel memory unexpectedly un-initialized.
       - copy_from did not use the user length to determine if the
         attribute data is inlined or not.
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      89d9e8d3
    • Leon Romanovsky's avatar
      RDMA/restrack: Remove unimplemented XRCD object · 415bb699
      Leon Romanovsky authored
      Resource tracking of XRCD objects is not implemented in current
      version of restrack and hence can be removed.
      
      Fixes: 02d8883f ("RDMA/restrack: Add general infrastructure to track RDMA resources")
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      415bb699
    • Alaa Hleihel's avatar
      IB/ipoib: Do not warn if IPoIB debugfs doesn't exist · 14fa91e0
      Alaa Hleihel authored
      netdev_wait_allrefs() could rebroadcast NETDEV_UNREGISTER event
      multiple times until all refs are gone, which will result in calling
      ipoib_delete_debug_files multiple times and printing a warning.
      
      Remove the WARN_ONCE since checks of NULL pointers before calling
      debugfs_remove are not needed.
      
      Fixes: 771a5258 ("IB/IPoIB: ibX: failed to create mcg debug file")
      Signed-off-by: default avatarAlaa Hleihel <alaa@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Reviewed-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      14fa91e0
  2. 11 Feb, 2018 9 commits
    • Linus Torvalds's avatar
      Linux 4.16-rc1 · 7928b2cb
      Linus Torvalds authored
      7928b2cb
    • Al Viro's avatar
      unify {de,}mangle_poll(), get rid of kernel-side POLL... · 7a163b21
      Al Viro authored
      except, again, POLLFREE and POLL_BUSY_LOOP.
      
      With this, we finally get to the promised end result:
      
       - POLL{IN,OUT,...} are plain integers and *not* in __poll_t, so any
         stray instances of ->poll() still using those will be caught by
         sparse.
      
       - eventpoll.c and select.c warning-free wrt __poll_t
      
       - no more kernel-side definitions of POLL... - userland ones are
         visible through the entire kernel (and used pretty much only for
         mangle/demangle)
      
       - same behavior as after the first series (i.e. sparc et.al. epoll(2)
         working correctly).
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7a163b21
    • Linus Torvalds's avatar
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds authored
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But they keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a9a08845
    • Linus Torvalds's avatar
      Merge branch 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · ee5daa13
      Linus Torvalds authored
      Pull more poll annotation updates from Al Viro:
       "This is preparation to solving the problems you've mentioned in the
        original poll series.
      
        After this series, the kernel is ready for running
      
            for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
                  L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
                  for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
            done
      
        as a for bulk search-and-replace.
      
        After that, the kernel is ready to apply the patch to unify
        {de,}mangle_poll(), and then get rid of kernel-side POLL... uses
        entirely, and we should be all done with that stuff.
      
        Basically, that's what you suggested wrt KPOLL..., except that we can
        use EPOLL... instead - they already are arch-independent (and equal to
        what is currently kernel-side POLL...).
      
        After the preparations (in this series) switch to returning EPOLL...
        from ->poll() instances is completely mechanical and kernel-side
        POLL... can go away. The last step (killing kernel-side POLL... and
        unifying {de,}mangle_poll() has to be done after the
        search-and-replace job, since we need userland-side POLL... for
        unified {de,}mangle_poll(), thus the cherry-pick at the last step.
      
        After that we will have:
      
         - POLL{IN,OUT,...} *not* in __poll_t, so any stray instances of
           ->poll() still using those will be caught by sparse.
      
         - eventpoll.c and select.c warning-free wrt __poll_t
      
         - no more kernel-side definitions of POLL... - userland ones are
           visible through the entire kernel (and used pretty much only for
           mangle/demangle)
      
         - same behavior as after the first series (i.e. sparc et.al. epoll(2)
           working correctly)"
      
      * 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        annotate ep_scan_ready_list()
        ep_send_events_proc(): return result via esed->res
        preparation to switching ->poll() to returning EPOLL...
        add EPOLLNVAL, annotate EPOLL... and event_poll->event
        use linux/poll.h instead of asm/poll.h
        xen: fix poll misannotation
        smc: missing poll annotations
      ee5daa13
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20180211' of git://github.com/jcmvbkbc/linux-xtensa · 3fc928dc
      Linus Torvalds authored
      Pull xtense fix from Max Filippov:
       "Build fix for xtensa architecture with KASAN enabled"
      
      * tag 'xtensa-20180211' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: fix build with KASAN
      3fc928dc
    • Linus Torvalds's avatar
      Merge tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 · 60d7a21a
      Linus Torvalds authored
      Pull nios2 update from Ley Foon Tan:
      
       - clean up old Kconfig options from defconfig
      
       - remove leading 0x and 0s from bindings notation in dts files
      
      * tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
        nios2: defconfig: Cleanup from old Kconfig options
        nios2: dts: Remove leading 0x and 0s from bindings notation
      60d7a21a
    • Max Filippov's avatar
      xtensa: fix build with KASAN · f8d0cbf2
      Max Filippov authored
      The commit 917538e2 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT
      usage") removed KASAN_SHADOW_SCALE_SHIFT definition from
      include/linux/kasan.h and added it to architecture-specific headers,
      except for xtensa. This broke the xtensa build with KASAN enabled.
      Define KASAN_SHADOW_SCALE_SHIFT in arch/xtensa/include/asm/kasan.h
      
      Reported by: kbuild test robot <fengguang.wu@intel.com>
      Fixes: 917538e2 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage")
      Acked-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarMax Filippov <jcmvbkbc@gmail.com>
      f8d0cbf2
    • Krzysztof Kozlowski's avatar
      nios2: defconfig: Cleanup from old Kconfig options · e0691ebb
      Krzysztof Kozlowski authored
      Remove old, dead Kconfig option INET_LRO. It is gone since
      commit 7bbf3cae ("ipv4: Remove inet_lro library").
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Acked-by: default avatarLey Foon Tan <ley.foon.tan@intel.com>
      e0691ebb
    • Mathieu Malaterre's avatar
      nios2: dts: Remove leading 0x and 0s from bindings notation · 5d13c731
      Mathieu Malaterre authored
      Improve the DTS files by removing all the leading "0x" and zeros to fix the
      following dtc warnings:
      
      Warning (unit_address_format): Node /XXX unit name should not have leading "0x"
      
      and
      
      Warning (unit_address_format): Node /XXX unit name should not have leading 0s
      
      Converted using the following command:
      
      find . -type f \( -iname *.dts -o -iname *.dtsi \) -exec sed -E -i -e "s/@0x([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" -e "s/@0+([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" {} +
      
      For simplicity, two sed expressions were used to solve each warnings separately.
      
      To make the regex expression more robust a few other issues were resolved,
      namely setting unit-address to lower case, and adding a whitespace before the
      the opening curly brace:
      
      https://elinux.org/Device_Tree_Linux#Linux_conventions
      
      This is a follow up to commit 4c9847b7 ("dt-bindings: Remove leading 0x from bindings notation")
      Reported-by: default avatarDavid Daney <ddaney@caviumnetworks.com>
      Suggested-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarMathieu Malaterre <malat@debian.org>
      Acked-by: default avatarLey Foon Tan <ley.foon.tan@intel.com>
      5d13c731
  3. 10 Feb, 2018 14 commits
    • Linus Torvalds's avatar
      Merge tag 'pci-v4.16-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · d48fcbd8
      Linus Torvalds authored
      Pull PCI fix from Bjorn Helgaas:
       "Fix a POWER9/powernv INTx regression from the merge window (Alexey
        Kardashevskiy)"
      
      * tag 'pci-v4.16-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        powerpc/pci: Fix broken INTx configuration via OF
      d48fcbd8
    • Linus Torvalds's avatar
      Merge tag 'for-linus-20180210' of git://git.kernel.dk/linux-block · 9454473c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A few fixes to round off the merge window on the block side:
      
         - a set of bcache fixes by way of Michael Lyle, from the usual bcache
           suspects.
      
         - add a simple-to-hook-into function for bpf EIO error injection.
      
         - fix blk-wbt that mischarectized flushes as reads. Improve the logic
           so that flushes and writes are accounted as writes, and only reads
           as reads. From me.
      
         - fix requeue crash in BFQ, from Paolo"
      
      * tag 'for-linus-20180210' of git://git.kernel.dk/linux-block:
        block, bfq: add requeue-request hook
        bcache: fix for data collapse after re-attaching an attached device
        bcache: return attach error when no cache set exist
        bcache: set writeback_rate_update_seconds in range [1, 60] seconds
        bcache: fix for allocator and register thread race
        bcache: set error_limit correctly
        bcache: properly set task state in bch_writeback_thread()
        bcache: fix high CPU occupancy during journal
        bcache: add journal statistic
        block: Add should_fail_bio() for bpf error injection
        blk-wbt: account flush requests correctly
      9454473c
    • Linus Torvalds's avatar
      Merge tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86 · cc5cb5af
      Linus Torvalds authored
      Pull x86 platform driver updates from Darren Hart:
       "Mellanox fixes and new system type support.
      
        Mostly data for new system types with a correction and an
        uninitialized variable fix"
      
      [ Pulling from github because git.infradead.org currently seems to be
        down for some reason, but Darren had a backup location    - Linus ]
      
      * tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86:
        platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems
        platform/x86: mlx-platform: Add support for new msn201x system type
        platform/x86: mlx-platform: Add support for new msn274x system type
        platform/x86: mlx-platform: Fix power cable setting for msn21xx family
        platform/x86: mlx-platform: Add define for the negative bus
        platform/x86: mlx-platform: Use defines for bus assignment
        platform/mellanox: mlxreg-hotplug: Fix uninitialized variable
      cc5cb5af
    • Linus Torvalds's avatar
      Merge tag 'chrome-platform-for-linus-4.16' of... · e9d46f74
      Linus Torvalds authored
      Merge tag 'chrome-platform-for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
      
      Pull chrome platform updates from Benson Leung:
      
       - move cros_ec_dev to drivers/mfd
      
       - other small maintenance fixes
      
      [ The cros_ec_dev movement came in earlier through the MFD tree  - Linus ]
      
      * tag 'chrome-platform-for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
        platform/chrome: Use proper protocol transfer function
        platform/chrome: cros_ec_lpc: Add support for Google Glimmer
        platform/chrome: cros_ec_lpc: Register the driver if ACPI entry is missing.
        platform/chrome: cros_ec_lpc: remove redundant pointer request
        cros_ec: fix nul-termination for firmware build info
        platform/chrome: chromeos_laptop: make chromeos_laptop const
      e9d46f74
    • Linus Torvalds's avatar
      Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 15303ba5
      Linus Torvalds authored
      Pull KVM updates from Radim Krčmář:
       "ARM:
      
         - icache invalidation optimizations, improving VM startup time
      
         - support for forwarded level-triggered interrupts, improving
           performance for timers and passthrough platform devices
      
         - a small fix for power-management notifiers, and some cosmetic
           changes
      
        PPC:
      
         - add MMIO emulation for vector loads and stores
      
         - allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
           requiring the complex thread synchronization of older CPU versions
      
         - improve the handling of escalation interrupts with the XIVE
           interrupt controller
      
         - support decrement register migration
      
         - various cleanups and bugfixes.
      
        s390:
      
         - Cornelia Huck passed maintainership to Janosch Frank
      
         - exitless interrupts for emulated devices
      
         - cleanup of cpuflag handling
      
         - kvm_stat counter improvements
      
         - VSIE improvements
      
         - mm cleanup
      
        x86:
      
         - hypervisor part of SEV
      
         - UMIP, RDPID, and MSR_SMI_COUNT emulation
      
         - paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit
      
         - allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
           AVX512 features
      
         - show vcpu id in its anonymous inode name
      
         - many fixes and cleanups
      
         - per-VCPU MSR bitmaps (already merged through x86/pti branch)
      
         - stable KVM clock when nesting on Hyper-V (merged through
           x86/hyperv)"
      
      * tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
        KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
        KVM: PPC: Book3S HV: Branch inside feature section
        KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
        KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
        KVM: PPC: Book3S PR: Fix broken select due to misspelling
        KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
        KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
        KVM: PPC: Book3S HV: Drop locks before reading guest memory
        kvm: x86: remove efer_reload entry in kvm_vcpu_stat
        KVM: x86: AMD Processor Topology Information
        x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
        kvm: embed vcpu id to dentry of vcpu anon inode
        kvm: Map PFN-type memory regions as writable (if possible)
        x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
        KVM: arm/arm64: Fixup userspace irqchip static key optimization
        KVM: arm/arm64: Fix userspace_irqchip_in_use counting
        KVM: arm/arm64: Fix incorrect timer_is_pending logic
        MAINTAINERS: update KVM/s390 maintainers
        MAINTAINERS: add Halil as additional vfio-ccw maintainer
        MAINTAINERS: add David as a reviewer for KVM/s390
        ...
      15303ba5
    • Alexey Kardashevskiy's avatar
      powerpc/pci: Fix broken INTx configuration via OF · c591c2e3
      Alexey Kardashevskiy authored
      59f47eff ("powerpc/pci: Use of_irq_parse_and_map_pci() helper")
      replaced of_irq_parse_pci() + irq_create_of_mapping() with
      of_irq_parse_and_map_pci(), but neglected to capture the virq
      returned by irq_create_of_mapping(), so virq remained zero, which
      caused INTx configuration to fail.
      
      Save the virq value returned by of_irq_parse_and_map_pci() and correct
      the virq declaration to match the of_irq_parse_and_map_pci() signature.
      
      Fixes: 59f47eff "powerpc/pci: Use of_irq_parse_and_map_pci() helper"
      Signed-off-by: default avatarAlexey Kardashevskiy <aik@ozlabs.ru>
      [bhelgaas: changelog]
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      c591c2e3
    • Linus Torvalds's avatar
      Merge tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 9a61df9e
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
       "Makefile changes:
         - enable unused-variable warning that was wrongly disabled for clang
      
        Kconfig changes:
         - warn about blank 'help' and fix existing instances
         - fix 'choice' behavior to not write out invisible symbols
         - fix misc weirdness
      
        Coccinell changes:
         - fix false positive of free after managed memory alloc detection
         - improve performance of NULL dereference detection"
      
      * tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits)
        kconfig: remove const qualifier from sym_expand_string_value()
        kconfig: add xrealloc() helper
        kconfig: send error messages to stderr
        kconfig: echo stdin to stdout if either is redirected
        kconfig: remove check_stdin()
        kconfig: remove 'config*' pattern from .gitignnore
        kconfig: show '?' prompt even if no help text is available
        kconfig: do not write choice values when their dependency becomes n
        coccinelle: deref_null: avoid useless computation
        coccinelle: devm_free: reduce false positives
        kbuild: clang: disable unused variable warnings only when constant
        kconfig: Warn if help text is blank
        nios2: kconfig: Remove blank help text
        arm: vt8500: kconfig: Remove blank help text
        MIPS: kconfig: Remove blank help text
        MIPS: BCM63XX: kconfig: Remove blank help text
        lib/Kconfig.debug: Remove blank help text
        Staging: rtl8192e: kconfig: Remove blank help text
        Staging: rtl8192u: kconfig: Remove blank help text
        mmc: kconfig: Remove blank help text
        ...
      9a61df9e
    • Al Viro's avatar
      7a501609
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 878e66d0
      Linus Torvalds authored
      Pull misc vfs fixes from Al Viro.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        seq_file: fix incomplete reset on read from zero offset
        kernfs: fix regression in kernfs_fop_write caused by wrong type
      878e66d0
    • Masahiro Yamada's avatar
      kconfig: remove const qualifier from sym_expand_string_value() · 523ca58b
      Masahiro Yamada authored
      This function returns realloc'ed memory, so the returned pointer
      must be passed to free() when done.  So, 'const' qualifier is odd.
      It is allowed to modify the expanded string.
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      523ca58b
    • Masahiro Yamada's avatar
      kconfig: add xrealloc() helper · d717f24d
      Masahiro Yamada authored
      We already have xmalloc(), xcalloc().  Add xrealloc() as well
      to save tedious error handling.
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      d717f24d
    • Vadim Pasternak's avatar
      platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems · 1bd42d94
      Vadim Pasternak authored
      It adds support for new Mellanox system types of basic classes qmb7, sn34,
      sn37, containing systems QMB700 (40x200GbE InfiniBand switch), SN3700
      (32x200GbE and 16x400GbE Ethernet switch) and SN3410 (6x400GbE plus
      48x50GbE Ethernet switch). These are the Top of the Rack systems, equipped
      with Mellanox COM-Express carrier board and switch board with Mellanox
      Quantum device, which supports InfiniBand switching with 40X200G ports and
      line rate of up to HDR speed or with Mellanox Spectrum-2 device, which
      supports Ethernet switching with 32X200G ports line rate of up to HDR
      speed.
      Signed-off-by: default avatarVadim Pasternak <vadimp@mellanox.com>
      Signed-off-by: default avatarDarren Hart (VMware) <dvhart@infradead.org>
      1bd42d94
    • Vadim Pasternak's avatar
      platform/x86: mlx-platform: Add support for new msn201x system type · a49a4148
      Vadim Pasternak authored
      It adds support for new Mellanox system types of basic half unit size
      class msn201x, containing system MSN2010 (18x10GbE plus 4x4x25GbE) half
      and its derivatives. This is the Top of the Rack system, equipped with
      Mellanox Small Form Factor carrier board and switch board with Mellanox
      Spectrum device, which supports Ethernet switching with 32X100G ports line
      rate of up to EDR speed.
      Signed-off-by: default avatarVadim Pasternak <vadimp@mellanox.com>
      Signed-off-by: default avatarDarren Hart (VMware) <dvhart@infradead.org>
      a49a4148
    • Vadim Pasternak's avatar
      platform/x86: mlx-platform: Add support for new msn274x system type · ef08e14a
      Vadim Pasternak authored
      It adds support for new Mellanox system types of basic class msn274x,
      containing system MSN2740 (32x100GbE Ethernet switch with cost reduction)
      and its derivatives. These are the Top of the Rack system, equipped with
      Mellanox Small Form Factor carrier board and switch board with Mellanox
      Spectrum device, which supports Ethernet switching with 32X100G ports line
      rate of up to EDR speed.
      Signed-off-by: default avatarVadim Pasternak <vadimp@mellanox.com>
      Signed-off-by: default avatarDarren Hart (VMware) <dvhart@infradead.org>
      ef08e14a
  4. 09 Feb, 2018 1 commit
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c839682c
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Make allocations less aggressive in x_tables, from Minchal Hocko.
      
       2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso.
      
       3) Fix connection loss problems in rtlwifi, from Larry Finger.
      
       4) Correct DRAM dump length for some chips in ath10k driver, from Yu
          Wang.
      
       5) Fix ABORT handling in rxrpc, from David Howells.
      
       6) Add SPDX tags to Sun networking drivers, from Shannon Nelson.
      
       7) Some ipv6 onlink handling fixes, from David Ahern.
      
       8) Netem packet scheduler interval calcualtion fix from Md. Islam.
      
       9) Don't put crypto buffers on-stack in rxrpc, from David Howells.
      
      10) Fix handling of error non-delivery status in netlink multicast
          delivery over multiple namespaces, from Nicolas Dichtel.
      
      11) Missing xdp flush in tuntap driver, from Jason Wang.
      
      12) Synchonize RDS protocol netns/module teardown with rds object
          management, from Sowini Varadhan.
      
      13) Add nospec annotations to mpls, from Dan Williams.
      
      14) Fix SKB truesize handling in TIPC, from Hoang Le.
      
      15) Interrupt masking fixes in stammc from Niklas Cassel.
      
      16) Don't allow ptr_ring objects to be sized outside of kmalloc's
          limits, from Jason Wang.
      
      17) Don't allow SCTP chunks to be built which will have a length
          exceeding the chunk header's 16-bit length field, from Alexey
          Kodanev.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits)
        ibmvnic: Remove skb->protocol checks in ibmvnic_xmit
        bpf: fix rlimit in reuseport net selftest
        sctp: verify size of a new chunk in _sctp_make_chunk()
        s390/qeth: fix SETIP command handling
        s390/qeth: fix underestimated count of buffer elements
        ptr_ring: try vmalloc() when kmalloc() fails
        ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE
        net: stmmac: remove redundant enable of PMT irq
        net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4
        net: stmmac: discard disabled flags in interrupt status register
        ibmvnic: Reset long term map ID counter
        tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
        selftests/bpf: add selftest that use test_libbpf_open
        selftests/bpf: add test program for loading BPF ELF files
        tools/libbpf: improve the pr_debug statements to contain section numbers
        bpf: Sync kernel ABI header with tooling header for bpf_common.h
        net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
        net: thunder: change q_len's type to handle max ring size
        tipc: fix skb truesize/datasize ratio control
        net/sched: cls_u32: fix cls_u32 on filter replace
        ...
      c839682c