    libceph: use ceph_kvmalloc() for osdmap arrays · cf73d882
    Ilya Dryomov authored
    osdmap has a bunch of arrays that grow linearly with the number of
    OSDs.  osd_state, osd_weight and osd_primary_affinity take 4 bytes per
    OSD.  osd_addr takes 136 bytes per OSD because of sockaddr_storage.
    The CRUSH workspace area also grows linearly with the number of OSDs.
    
    Normally these arrays are allocated at client startup.  The osdmap is
    usually updated in small incrementals, but once in a while a full map
    may need to be processed.  For a cluster with 10000 OSDs, this means
    a bunch of 40K allocations followed by a 1.3M allocation, all of which
    are currently required to be physically contiguous.  This results in
    sporadic ENOMEM errors, hanging the client.
    
    Go back to manually (re)allocating arrays and use ceph_kvmalloc() to
    fall back to non-contiguous allocation when necessary.
    
    Link: https://tracker.ceph.com/issues/40481
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Jeff Layton <jlayton@kernel.org>
osdmap.c 60.3 KB