1. 09 Apr, 2023 2 commits
    • RDMA/srpt: Add a check for valid 'mad_agent' pointer · eca5cd94
      Saravanan Vajravel authored
      When unregistering the MAD agent, the srpt module checks that the
      'mad_agent' pointer is non-NULL before invoking ib_unregister_mad_agent().
      This check can pass if 'mad_agent' holds an error value.
      'mad_agent' can hold an error value for a short window when
      srpt_add_one() and srpt_remove_one() are executed simultaneously.
      
      In the srpt module, add a valid-pointer check for 'sport->mad_agent'
      before unregistering the MAD agent.
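
The difference between a non-NULL check and a valid-pointer check can be sketched in userspace C. This is only an illustration: the helpers below are simplified stand-ins modelled on the kernel's ERR_PTR() and IS_ERR_OR_NULL() from include/linux/err.h, and struct demo_port and both check functions are hypothetical names, not code from the patch. The commit message does not say which helper the fix uses, so treat IS_ERR_OR_NULL() here as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, as the kernel's ERR_PTR() does. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* True for NULL and for pointers in the top MAX_ERRNO addresses. */
static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical miniature of the port state; only the relevant field. */
struct demo_port {
	void *mad_agent;
};

/* Non-NULL check: an error-encoded pointer passes it. */
static inline bool demo_nonnull_check(const struct demo_port *p)
{
	return p->mad_agent != NULL;
}

/* Valid-pointer check: rejects both NULL and error-encoded pointers. */
static inline bool demo_valid_check(const struct demo_port *p)
{
	return !IS_ERR_OR_NULL(p->mad_agent);
}
```

With mad_agent set to ERR_PTR(-12), demo_nonnull_check() returns true and demo_valid_check() returns false, which is the window described above.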
      
      This issue can be hit when the RoCE driver unregisters the ib_device.
      
      Stack Trace:
      ------------
      BUG: kernel NULL pointer dereference, address: 000000000000004d
      PGD 145003067 P4D 145003067 PUD 2324fe067 PMD 0
      Oops: 0002 [#1] PREEMPT SMP NOPTI
      CPU: 10 PID: 4459 Comm: kworker/u80:0 Kdump: loaded Tainted: P
      Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS 2.5.4 01/13/2020
      Workqueue: bnxt_re bnxt_re_task [bnxt_re]
      RIP: 0010:_raw_spin_lock_irqsave+0x19/0x40
      Call Trace:
        ib_unregister_mad_agent+0x46/0x2f0 [ib_core]
        IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
        ? __schedule+0x20b/0x560
        srpt_unregister_mad_agent+0x93/0xd0 [ib_srpt]
        srpt_remove_one+0x20/0x150 [ib_srpt]
        remove_client_context+0x88/0xd0 [ib_core]
        bond0: (slave p2p1): link status definitely up, 100000 Mbps full duplex
        disable_device+0x8a/0x160 [ib_core]
        bond0: active interface up!
        ? kernfs_name_hash+0x12/0x80
       (NULL device *): Bonding Info Received: rdev: 000000006c0b8247
        __ib_unregister_device+0x42/0xb0 [ib_core]
       (NULL device *):         Master: mode: 4 num_slaves:2
        ib_unregister_device+0x22/0x30 [ib_core]
       (NULL device *):         Slave: id: 105069936 name:p2p1 link:0 state:0
        bnxt_re_stopqps_and_ib_uninit+0x83/0x90 [bnxt_re]
        bnxt_re_alloc_lag+0x12e/0x4e0 [bnxt_re]
      
      Fixes: a42d985b ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
      Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
      Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
      Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
      Link: https://lore.kernel.org/r/20230406042549.507328-1-saravanan.vajravel@broadcom.com
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
    • RDMA/cm: Trace icm_send_rej event before the cm state is reset · bd9de1ba
      Mark Zhang authored
      Trace the icm_send_rej event before the cm state is reset to idle, so
      that the correct cm state is logged. For example, when an incoming
      request is rejected, the old trace log was:
          icm_send_rej: local_id=961102742 remote_id=3829151631 state=IDLE reason=REJ_CONSUMER_DEFINED
      With this patch:
          icm_send_rej: local_id=312971016 remote_id=3778819983 state=MRA_REQ_SENT reason=REJ_CONSUMER_DEFINED
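
Why the ordering matters can be sketched with a small userspace model. Everything here is hypothetical (the demo_* names, the two-state enum, and writing the trace line into a buffer); it is not the RDMA CM code, only an illustration of tracing after versus before a state reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical two-state model of a connection identifier. */
enum demo_state { DEMO_IDLE, DEMO_MRA_REQ_SENT };

struct demo_cm_id {
	enum demo_state state;
};

static inline const char *demo_state_name(enum demo_state s)
{
	return s == DEMO_IDLE ? "IDLE" : "MRA_REQ_SENT";
}

/* Stand-in for a tracepoint: records the state as it is at call time. */
static inline void demo_trace_rej(const struct demo_cm_id *id,
				  char *buf, size_t len)
{
	snprintf(buf, len, "state=%s", demo_state_name(id->state));
}

/* Old order: reset first, so the trace always sees IDLE. */
static inline void demo_reject_old(struct demo_cm_id *id, char *buf, size_t len)
{
	id->state = DEMO_IDLE;
	demo_trace_rej(id, buf, len);
}

/* New order: trace first, so the pre-reset state is recorded. */
static inline void demo_reject_new(struct demo_cm_id *id, char *buf, size_t len)
{
	demo_trace_rej(id, buf, len);
	id->state = DEMO_IDLE;
}
```

Both variants leave the identifier in the idle state; only the recorded trace line differs.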
      
      Fixes: 8dc105be ("RDMA/cm: Add tracepoints to track MAD send operations")
      Signed-off-by: Mark Zhang <markzhang@nvidia.com>
      Link: https://lore.kernel.org/r/20230330072351.481200-1-markzhang@nvidia.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
  2. 04 Apr, 2023 7 commits
  3. 03 Apr, 2023 7 commits
  4. 30 Mar, 2023 1 commit
  5. 29 Mar, 2023 8 commits
  6. 24 Mar, 2023 12 commits
  7. 23 Mar, 2023 3 commits
    • RDMA/core: Fix multiple -Warray-bounds warnings · aa4d540b
      Gustavo A. R. Silva authored
      GCC-13 (and Clang)[1] do not like accesses to a partially allocated
      object, since they cannot reason about it for bounds checking.
      
      In this case 140 bytes are allocated for an object of type struct
      ib_umad_packet:
      
              packet = kzalloc(sizeof(*packet) + IB_MGMT_RMPP_HDR, GFP_KERNEL);
      
      However, notice that sizeof(*packet) is only 104 bytes:
      
      struct ib_umad_packet {
              struct ib_mad_send_buf *   msg;                  /*     0     8 */
              struct ib_mad_recv_wc *    recv_wc;              /*     8     8 */
              struct list_head           list;                 /*    16    16 */
              int                        length;               /*    32     4 */
      
              /* XXX 4 bytes hole, try to pack */
      
              struct ib_user_mad         mad __attribute__((__aligned__(8))); /*    40    64 */
      
              /* size: 104, cachelines: 2, members: 5 */
              /* sum members: 100, holes: 1, sum holes: 4 */
              /* forced alignments: 1, forced holes: 1, sum forced holes: 4 */
              /* last cacheline: 40 bytes */
      } __attribute__((__aligned__(8)));
      
      and 36 extra bytes are allocated for a flexible-array member in
      struct ib_user_mad:
      
      include/rdma/ib_mad.h:
      enum {
      ...
              IB_MGMT_RMPP_HDR = 36,
      ... }
      
      struct ib_user_mad {
              struct ib_user_mad_hdr     hdr;                  /*     0    64 */
              /* --- cacheline 1 boundary (64 bytes) --- */
              __u64                      data[] __attribute__((__aligned__(8))); /*    64     0 */
      
              /* size: 64, cachelines: 1, members: 2 */
              /* forced alignments: 1 */
      } __attribute__((__aligned__(8)));
      
      So we have sizeof(*packet) + IB_MGMT_RMPP_HDR == 140 bytes
      
      Then the address of the flex-array member (for which only 36 bytes were
      allocated) is cast and assigned to a pointer to struct ib_rmpp_mad,
      which, in turn, is 256 bytes in size:
      
              rmpp_mad = (struct ib_rmpp_mad *) packet->mad.data;
      
      struct ib_rmpp_mad {
              struct ib_mad_hdr          mad_hdr;              /*     0    24 */
              struct ib_rmpp_hdr         rmpp_hdr;             /*    24    12 */
              u8                         data[220];            /*    36   220 */
      
              /* size: 256, cachelines: 4, members: 3 */
      };
      
      The thing is that those 36 bytes allocated for flex-array member data
      in struct ib_user_mad only account for the size of both struct ib_mad_hdr
      and struct ib_rmpp_hdr, but nothing is left for array u8 data[220].
      So, the compiler is legitimately complaining about accessing an object
      for which not enough memory was allocated.
      
      Apparently, the only members of struct ib_rmpp_mad that are relevant
      (that are actually being used) in function ib_umad_write() are mad_hdr
      and rmpp_hdr. So, instead of casting packet->mad.data to
      (struct ib_rmpp_mad *), create a new structure
      
      struct ib_rmpp_mad_hdr {
              struct ib_mad_hdr       mad_hdr;
              struct ib_rmpp_hdr      rmpp_hdr;
      } __packed;
      
      and cast packet->mad.data to (struct ib_rmpp_mad_hdr *).
      
      Notice that
      
              IB_MGMT_RMPP_HDR == sizeof(struct ib_rmpp_mad_hdr) == 36 bytes
      
      Refactor the rest of the code, accordingly.
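
The size arithmetic above can be checked with a small userspace sketch. The demo_* structs are hypothetical stand-ins that only reproduce the sizes quoted in this message (24, 12, and 220 bytes); they are not the real kernel definitions, and the constant 104 is taken from the pahole output above rather than computed.

```c
#include <assert.h>
#include <stddef.h>

enum { IB_MGMT_RMPP_HDR = 36 };      /* value quoted from include/rdma/ib_mad.h */
enum { DEMO_UMAD_PACKET_SIZE = 104 };/* sizeof(struct ib_umad_packet) per pahole */

/* Opaque stand-ins sized like struct ib_mad_hdr and struct ib_rmpp_hdr. */
struct demo_mad_hdr  { unsigned char raw[24]; };
struct demo_rmpp_hdr { unsigned char raw[12]; };

/* Full-size view, laid out like struct ib_rmpp_mad (256 bytes). */
struct demo_rmpp_mad {
	struct demo_mad_hdr  mad_hdr;
	struct demo_rmpp_hdr rmpp_hdr;
	unsigned char        data[220];
};

/* Header-only view, laid out like the new struct ib_rmpp_mad_hdr. */
struct demo_rmpp_mad_hdr {
	struct demo_mad_hdr  mad_hdr;
	struct demo_rmpp_hdr rmpp_hdr;
} __attribute__((packed));

_Static_assert(sizeof(struct demo_rmpp_mad) == 256,
	       "full view is 256 bytes");
_Static_assert(sizeof(struct demo_rmpp_mad_hdr) == IB_MGMT_RMPP_HDR,
	       "header-only view matches the 36 bytes that were allocated");

/* Total bytes requested by the kzalloc() call quoted above. */
static inline size_t demo_alloc_size(void)
{
	return DEMO_UMAD_PACKET_SIZE + IB_MGMT_RMPP_HDR;
}

/* Does a view of the given size fit in the flex-array bytes allocated? */
static inline int demo_view_fits(size_t view_size)
{
	return view_size <= IB_MGMT_RMPP_HDR;
}
```

Under these assumptions demo_alloc_size() is 140, the 256-byte view does not fit in the 36 allocated bytes, and the header-only view does, which is the mismatch the compiler reports.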
      
      Fix the following warnings seen under GCC-13 and -Warray-bounds:
      drivers/infiniband/core/user_mad.c:564:50: warning: array subscript ‘struct ib_rmpp_mad[0]’ is partly outside array bounds of ‘unsigned char[140]’ [-Warray-bounds=]
      drivers/infiniband/core/user_mad.c:566:42: warning: array subscript ‘struct ib_rmpp_mad[0]’ is partly outside array bounds of ‘unsigned char[140]’ [-Warray-bounds=]
      drivers/infiniband/core/user_mad.c:618:25: warning: array subscript ‘struct ib_rmpp_mad[0]’ is partly outside array bounds of ‘unsigned char[140]’ [-Warray-bounds=]
      drivers/infiniband/core/user_mad.c:622:44: warning: array subscript ‘struct ib_rmpp_mad[0]’ is partly outside array bounds of ‘unsigned char[140]’ [-Warray-bounds=]
      
      Link: https://github.com/KSPP/linux/issues/273
      Link: https://godbolt.org/z/oYWaGM4Yb [1]
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://lore.kernel.org/r/ZBpB91qQcB10m3Fw@work
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
    • Enable IB out-of-order by default in mlx5 · 602fb420
      Leon Romanovsky authored
      This series from Or changes the default of the IB out-of-order feature
      and allows RDMA users to decide whether they need to wait for completion
      of all segments or whether waiting for completion of the last segment
      is enough.
      
      Thanks
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
    • Or Har-Toov's avatar