• Kirill Smelkov's avatar
    bigfile/pagemap: Fix non-leaf page iteration · ee9bcd00
    Kirill Smelkov authored
    Since the beginning of pagemap (45af76e6 "bigfile/pagemap: specialized
    {} uint64 -> void * mapping") we had a bug sitting in
    __pagemap_for_each_leaftab() (non-leaf iterating logic behind
    pagemap_for_each):
    
    After entry to stack-down was found, we did not updated tailv[l]
    accordingly. Thus if there are non-adjacent entries an entry could be
    e.g. emitted many times:
    
         l 3  __down 0x7f79da1ee000
         tailv[4]: 0x7f79da1ee000
          -> tailv[4] 0x7f79da1ee000  __down 0x7f79da1ed000
    
         l 4  __down 0x7f79da1ed000
         tailv[5]: 0x7f79da1ed000
         h 5  l 5  leaftab: 0x7f79da1ed000      <--
          lvl 5  idx 169  page 0x55aa
        ok 9 - pagemap_for_each(0) == 21930
    
         l 5  __down (nil)
         tailv[4]: 0x7f79da1ee008
          -> tailv[4] 0x7f79da1ee008  __down 0x7f79da1ed000
    
         l 4  __down 0x7f79da1ed000
         tailv[5]: 0x7f79da1ed000
         h 5  l 5  leaftab: 0x7f79da1ed000      <--
          lvl 5  idx 169  page 0x55aa
        not ok 10 - pagemap_for_each(1) == 140724106500272
    
    And many-time-emitted entries are not only incorrect, but can also lead
    to not-handled segmentation faults in e.g. fileh_close():
    
        https://lab.nexedi.com/nexedi/wendelin.core/blob/v0.6-1-gb0b2c52/bigfile/virtmem.c#L179
    
        /* drop all pages (dirty or not) associated with this fileh */
        pagemap_for_each(page, &fileh->pagemap) {
            /* it's an error to close fileh to mapping of which an access is
             * currently being done in another thread */
            BUG_ON(page->state == PAGE_LOADING);
            page_drop_memory(page);
            list_del(&page->lru);                           <-- HERE
            bzero(page, sizeof(*page)); /* just in case */
            free(page);
        }
    
    ( because after first bzero of a page, the page is all 0 bytes including
      page->lru{.next,.prev} so on the second time when the same page is
      emitted by pagemap_for_each, list_del(&page->lru) will try to set
      page->lru.next = ... which will segfault. )
    
    So fix it by properly updating tailv[l] while we scan/iterate current level.
    
    NOTE
    
    This applies only to non-leaf pagemap levels, as leaf level is scanned
    with separate loop in pagemap_for_each. That's why we probably did not
    noticed this earlier - up until now our usual workloads was to change
    data in adjacent batches and that means adjacent pages.
    
    Though today @Tyagov was playing with wendelin.core in some other way and
    it uncovered the bug.
    ee9bcd00
Name
Last commit
Last update
3rdparty Loading commit data...
bigarray Loading commit data...
bigfile Loading commit data...
demo Loading commit data...
include/wendelin Loading commit data...
lib Loading commit data...
t Loading commit data...
.gitignore Loading commit data...
.gitmodules Loading commit data...
CHANGELOG.rst Loading commit data...
COPYING Loading commit data...
Makefile Loading commit data...
README.rst Loading commit data...
setup.py Loading commit data...
tox.ini Loading commit data...
wendelin.py Loading commit data...