    ceph: remove fragile __map_osds optimization · c99eb1c7
    Sage Weil authored
    We used to try to avoid freeing and then reallocating the osd
    struct.  This is a bit fragile due to potential interactions with
    other references (beyond o_requests), and may be the cause of
    this crash:
    
    [120633.442358] BUG: unable to handle kernel NULL pointer dereference at (null)
    [120633.443292] IP: [<ffffffff812549b6>] rb_erase+0x11d/0x277
    [120633.443292] PGD f7ff3067 PUD f7f53067 PMD 0
    [120633.443292] Oops: 0000 [#1] PREEMPT SMP
    [120633.443292] last sysfs file: /sys/kernel/uevent_seqnum
    [120633.443292] CPU 1
    [120633.443292] Modules linked in: ceph fan ac battery psmouse ehci_hcd ide_pci_generic ohci_hcd thermal processor button
    [120633.443292] Pid: 3023, comm: ceph-msgr/1 Not tainted 2.6.32-rc2 #12 H8SSL
    [120633.443292] RIP: 0010:[<ffffffff812549b6>]  [<ffffffff812549b6>] rb_erase+0x11d/0x277
    [120633.443292] RSP: 0018:ffff8800f7b13a50  EFLAGS: 00010246
    [120633.443292] RAX: ffff880022907819 RBX: ffff880022907818 RCX: 0000000000000000
    [120633.443292] RDX: ffff8800f7b13a80 RSI: ffff8800f587eb48 RDI: 0000000000000000
    [120633.443292] RBP: ffff8800f7b13a60 R08: 0000000000000000 R09: 0000000000000004
    [120633.443292] R10: 0000000000000000 R11: ffff8800c4441000 R12: ffff8800f587eb48
    [120633.443292] R13: ffff8800f58eaa00 R14: ffff8800f413c000 R15: 0000000000000001
    [120633.443292] FS:  00007fbef6e226e0(0000) GS:ffff880009200000(0000) knlGS:0000000000000000
    [120633.443292] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    [120633.443292] CR2: 0000000000000000 CR3: 00000000f7c53000 CR4: 00000000000006e0
    [120633.443292] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [120633.443292] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [120633.443292] Process ceph-msgr/1 (pid: 3023, threadinfo ffff8800f7b12000, task ffff8800f5858b40)
    [120633.443292] Stack:
    [120633.443292]  ffff8800f413c000 ffff8800f587e9c0 ffff8800f7b13a80 ffffffffa0098a86
    [120633.443292] <0> 00000000000006f1 0000000000000000 ffff8800f7b13af0 ffffffffa009959b
    [120633.443292] <0> ffff8800f413c000 ffff880022a68400 ffff880022a68400 ffff8800f587e9c0
    [120633.443292] Call Trace:
    [120633.443292]  [<ffffffffa0098a86>] __remove_osd+0x4d/0xbc [ceph]
    [120633.443292]  [<ffffffffa009959b>] __map_osds+0x199/0x4fa [ceph]
    [120633.443292]  [<ffffffffa00999f4>] ? __send_request+0xf8/0x186 [ceph]
    [120633.443292]  [<ffffffffa0099beb>] kick_requests+0x169/0x3cb [ceph]
    [120633.443292]  [<ffffffffa009a8c1>] ceph_osdc_handle_map+0x370/0x522 [ceph]
    
    Since we're probably screwed anyway if a small kmalloc is
    failing, don't bother with trying to be clever here.
    
    Signed-off-by: Sage Weil <sage@newdream.net>
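
The idea in the message above, dropping the old osd struct and allocating a fresh one instead of trying to reuse it, can be sketched in plain userspace C. This is only an illustration under stated assumptions: `toy_osd`, `toy_osd_create`, `toy_osd_put`, and `toy_remap` are hypothetical names, the struct layout is invented, and none of this is the actual osd_client.c code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the osd struct; not the real kernel layout. */
struct toy_osd {
    int id;
    int nrefs;  /* references held on this struct, possibly by other users */
};

/* Allocate a fresh struct holding one reference; NULL if allocation fails. */
static struct toy_osd *toy_osd_create(int id)
{
    struct toy_osd *o = calloc(1, sizeof(*o));

    if (!o)
        return NULL;
    o->id = id;
    o->nrefs = 1;
    return o;
}

/* Drop one reference; the struct is freed when the last one goes away. */
static void toy_osd_put(struct toy_osd *o)
{
    if (!o)
        return;
    assert(o->nrefs > 0);
    if (--o->nrefs == 0)
        free(o);
}

/*
 * Simple remap: release our reference on the old struct and allocate a
 * new one. No attempt is made to recycle the old memory, so any other
 * holder of a reference keeps a consistent (if stale) object.
 * Returns NULL if the allocation fails.
 */
static struct toy_osd *toy_remap(struct toy_osd *old, int new_id)
{
    toy_osd_put(old);
    return toy_osd_create(new_id);
}
```

A caller would replace its pointer with the return value of `toy_remap()` and treat a NULL result as an allocation failure. The cost is one extra free and allocation per remap, which the message argues is not worth optimizing away.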
osd_client.c 38.7 KB