    [PATCH] device-mapper snapshot: replace sibling list · b4b610f6
    Alasdair G Kergon authored
    The siblings "list" is used unsafely at the moment.
    
    Firstly, only the element on the list being changed gets locked (via the
    snapshot lock), not the next and previous elements which have pointers that
    are also being changed.
    
    Secondly, if you have two or more snapshots and write to the same chunk a
    second time before every snapshot has finished making its private copy of the
    data, then with unlucky timing _origin_write() could attempt its list_merge()
    and dereference a 'last' pointer to a pending_exception structure that has
    just been freed.
    
    Analysis reveals that the list is actually only there for reference counting.
    If 5 pending_exceptions are needed in origin_write, then the 5 are joined
    together into a 5-element list - without a separate list head because there's
    nowhere suitable to store it.  As the pending_exceptions complete, they are
    removed from the list one-by-one and any contents of origin_bios get moved
    across to one of the remaining pending_exceptions on the list.  Whichever one
    is last is detected because list_empty() is then true and the origin_bios get
    submitted.
    
    The fix proposed here uses an alternative reference counting mechanism by
    choosing one of the pending_exceptions as primary and maintaining an atomic
    counter there.
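
A rough userspace sketch of the counting idea follows, using C11 atomics in place of the kernel's atomic_t. Everything in it is an assumption made for illustration rather than the code in this patch: struct pe_counted, its field names, pe_counted_group() and pe_counted_complete() are hypothetical, an int stands in for the queued origin bios, and the structures are assumed to be zero-initialised before grouping.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pending_exception in the new scheme. */
struct pe_counted {
    struct pe_counted *primary; /* the chosen primary; itself on the primary */
    atomic_int sibling_count;   /* only meaningful on the primary */
    int origin_bios;            /* queued writes, kept on the primary only */
};

/* Choose pes[0] as primary and take one reference per sibling. */
static void pe_counted_group(struct pe_counted **pes, int n)
{
    assert(n > 0);
    struct pe_counted *primary = pes[0];

    atomic_store(&primary->sibling_count, n);
    for (int i = 0; i < n; i++)
        pes[i]->primary = primary;
}

/*
 * Called when one sibling's copy has completed. Returns true only for
 * the caller that drops the final reference; that caller is then the
 * one responsible for submitting primary->origin_bios.
 */
static bool pe_counted_complete(struct pe_counted *pe)
{
    struct pe_counted *primary = pe->primary;

    /* atomic_fetch_sub returns the value held before the decrement. */
    return atomic_fetch_sub(&primary->sibling_count, 1) == 1;
}
```

Completing a sibling here touches only the shared counter, never a neighbour's pointers. The sketch does not address when each structure may be freed; the primary has to outlive every sibling that still holds a pointer to it.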
    
    Signed-off-by: Alasdair G Kergon <agk@redhat.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
dm-snap.c 26.8 KB