    RDMA/core: Update CMA destination address on rdma_resolve_addr · 0e158630
    Shiraz Saleem authored
    8d037973 ("RDMA/core: Refactor rdma_bind_addr") introduces a regression
    on irdma devices in certain tests which use the RDMA CM, such as cmtime.
    
    No connections can be established, with the MAD QP experiencing a fatal
    error on the active side.
    
    The CMA destination address is not updated with the dst_addr when the ULP
    on the active side calls rdma_bind_addr followed by rdma_resolve_addr.
    The id_priv state is 'bound' in resolve_prepare_src and the update is skipped.
    
    This leaves the dgid passed into the irdma driver to create an Address Handle
    (AH) for the MAD QP at 0. The create AH descriptor as well as the ARP cache
    entry are invalid and HW throws an asynchronous event as a result.
    
    [ 1207.656888] resolve_prepare_src caller: ucma_resolve_addr+0xff/0x170 [rdma_ucm] daddr=200.0.4.28 id_priv->state=7
    [....]
    [ 1207.680362] ice 0000:07:00.1 rocep7s0f1: caller: irdma_create_ah+0x3e/0x70 [irdma] ah_id=0 arp_idx=0 dest_ip=0.0.0.0
    destMAC=00:00:64:ca:b7:52 ipvalid=1 raw=0000:0000:0000:0000:0000:ffff:0000:0000
    [ 1207.682077] ice 0000:07:00.1 rocep7s0f1: abnormal ae_id = 0x401 bool qp=1 qp_id = 1, ae_src=5
    [ 1207.691657] infiniband rocep7s0f1: Fatal error (1) on MAD QP (1)
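
The raw= value in the create-AH trace above is consistent with an IPv4-mapped IPv6 address built from an all-zero IPv4 destination. The sketch below is an illustration only, not irdma code: the helper names toy_ipv4_to_mapped_gid and toy_format_gid are hypothetical, and it assumes the raw field is printed as eight 16-bit hex groups.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TOY_GID_STR_LEN 40 /* 8 groups of 4 hex digits + 7 colons + NUL */

/*
 * Hypothetical helper: build an IPv4-mapped IPv6 address (::ffff:a.b.c.d)
 * from an IPv4 address given in host byte order.
 */
static void toy_ipv4_to_mapped_gid(uint32_t ip, uint8_t gid[16])
{
    memset(gid, 0, 16);
    gid[10] = 0xff;
    gid[11] = 0xff;
    gid[12] = (uint8_t)(ip >> 24);
    gid[13] = (uint8_t)(ip >> 16);
    gid[14] = (uint8_t)(ip >> 8);
    gid[15] = (uint8_t)ip;
}

/* Hypothetical helper: format 16 bytes as eight colon-separated hex groups. */
static void toy_format_gid(const uint8_t gid[16], char out[TOY_GID_STR_LEN])
{
    size_t n = 0;

    for (int i = 0; i < 8; i++) {
        n += (size_t)snprintf(out + n, TOY_GID_STR_LEN - n, "%s%02x%02x",
                              i ? ":" : "", gid[2 * i], gid[2 * i + 1]);
    }
}
```

With a destination of 0 this formats to 0000:0000:0000:0000:0000:ffff:0000:0000, matching the trace. With 200.0.4.28 (0xC800041C), the daddr that the ULP actually requested, the last two groups would be c800:041c instead.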
    
    Fix this by updating the CMA destination address when the ULP calls
    rdma_resolve_addr with the CM state already bound.
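
The bind-then-resolve sequence and the effect of the fix can be sketched with a small standalone model. This is a simplified illustration under stated assumptions, not the code in cma.c: the toy_* types and functions are hypothetical stand-ins for id_priv and the rdma_bind_addr / resolve_prepare_src paths, addresses are reduced to a single 32-bit IPv4 value, and the with_fix flag exists only to compare the two behaviours.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced state set; the real CM has more states. */
enum toy_cm_state { TOY_IDLE, TOY_ADDR_BOUND, TOY_ADDR_QUERY };

/* Hypothetical stand-in for the per-ID private data. */
struct toy_cm_id {
    enum toy_cm_state state;
    uint32_t src_ip; /* host byte order, 0 means unset */
    uint32_t dst_ip; /* host byte order, 0 means unset */
};

/* Models a bind call: only the source address is known at this point. */
static void toy_bind_addr(struct toy_cm_id *id, uint32_t src_ip)
{
    id->src_ip = src_ip;
    id->state = TOY_ADDR_BOUND;
}

/*
 * Models the resolve path. When the ID is still idle, source and
 * destination are recorded together. When the ULP has already bound the
 * ID, that step is skipped, so without the fix dst_ip stays at 0.
 */
static int toy_resolve_addr(struct toy_cm_id *id, uint32_t src_ip,
                            uint32_t dst_ip, bool with_fix)
{
    if (id->state == TOY_ADDR_BOUND) {
        if (with_fix)
            id->dst_ip = dst_ip; /* the update this commit describes */
    } else if (id->state == TOY_IDLE) {
        id->src_ip = src_ip;
        id->dst_ip = dst_ip;
    } else {
        return -1; /* resolve is not valid from any other state here */
    }
    id->state = TOY_ADDR_QUERY;
    return 0;
}
```

In this model, binding first and then resolving to 200.0.4.28 (0xC800041C) without the fix leaves dst_ip at 0, which corresponds to the dest_ip=0.0.0.0 seen in the trace. With the fix, the bound path stores the requested destination just as the idle path does.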
    
    Fixes: 8d037973 ("RDMA/core: Refactor rdma_bind_addr")
    Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
    Link: https://lore.kernel.org/r/20230712234133.1343-1-shiraz.saleem@intel.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
cma.c 143 KB