1. 29 Aug, 2024 3 commits
  2. 28 Aug, 2024 2 commits
    • Merge branch 'md-6.12-bitmap' into md-6.12 · 7f67fdae
      Song Liu authored
      From Yu Kuai (with minor changes by Song Liu):
      
      Background: the bitmap currently uses a global spin_lock, which causes
      lock contention and severe IO performance degradation for all raid
      levels.
      
      However, a new lock-free bitmap cannot be implemented while md-bitmap
      exposes its internal implementation through many exported APIs. Hence
      bitmap_operations is introduced to describe the bitmap core
      implementation. A new bitmap can then be added with its own
      bitmap_operations, and we only need to switch to the new one during
      initialization.
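      
      The idea of describing an implementation through a table of function
      pointers and selecting it once at initialization can be sketched in
      plain userspace C. This is only a minimal illustration under stated
      assumptions: every name below (demo_bitmap_ops, demo_array, the two
      backends) is hypothetical and is not the kernel's actual
      struct bitmap_operations or its callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified ops table; NOT the kernel's bitmap_operations. */
struct demo_bitmap_ops {
    const char *name;
    bool (*enabled)(const void *state);
    void (*startwrite)(void *state, size_t chunk);
};

/* Backend 1 (hypothetical): one dirty byte per chunk. */
struct demo_simple_state {
    unsigned char dirty[64];
    bool active;
};

static bool simple_enabled(const void *state)
{
    const struct demo_simple_state *s = state;
    return s->active;
}

static void simple_startwrite(void *state, size_t chunk)
{
    struct demo_simple_state *s = state;
    if (chunk < sizeof(s->dirty))
        s->dirty[chunk] = 1;
}

static const struct demo_bitmap_ops demo_simple_ops = {
    .name = "simple",
    .enabled = simple_enabled,
    .startwrite = simple_startwrite,
};

/* Backend 2 (hypothetical): only counts writes, standing in for a
 * different implementation behind the same interface. */
struct demo_counting_state {
    unsigned long writes;
};

static bool counting_enabled(const void *state)
{
    (void)state;
    return true;
}

static void counting_startwrite(void *state, size_t chunk)
{
    struct demo_counting_state *c = state;
    (void)chunk;
    c->writes++;
}

static const struct demo_bitmap_ops demo_counting_ops = {
    .name = "counting",
    .enabled = counting_enabled,
    .startwrite = counting_startwrite,
};

/* The caller only sees the ops pointer and an opaque state pointer. */
struct demo_array {
    const struct demo_bitmap_ops *ops;
    void *bitmap_state;
};

/* The backend is chosen once, at initialization. */
static void demo_array_init(struct demo_array *a,
                            const struct demo_bitmap_ops *ops, void *state)
{
    assert(ops && ops->enabled && ops->startwrite);
    a->ops = ops;
    a->bitmap_state = state;
}

/* The IO path never touches backend internals directly. */
static void demo_array_write(struct demo_array *a, size_t chunk)
{
    if (a->ops->enabled(a->bitmap_state))
        a->ops->startwrite(a->bitmap_state, chunk);
}
```

      In this sketch, switching backends means passing a different ops
      table to demo_array_init(); demo_array_write() stays unchanged
      because it depends only on the interface.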
      
      With this, the bitmap can also be built as a kernel module, but that
      is not our concern for now.
      
      This version was tested with mdadm tests and lvm2 tests. This set does
      not introduce new errors in these tests.
      
      * md-6.12-bitmap: (42 commits)
        md/md-bitmap: make in memory structure internal
        md/md-bitmap: merge md_bitmap_enabled() into bitmap_operations
        md/md-bitmap: merge md_bitmap_wait_behind_writes() into bitmap_operations
        md/md-bitmap: merge md_bitmap_free() into bitmap_operations
        md/md-bitmap: merge md_bitmap_set_pages() into struct bitmap_operations
        md/md-bitmap: merge md_bitmap_copy_from_slot() into struct bitmap_operation.
        md/md-bitmap: merge get_bitmap_from_slot() into bitmap_operations
        md/md-bitmap: merge md_bitmap_resize() into bitmap_operations
        md/md-bitmap: pass in mddev directly for md_bitmap_resize()
        md/md-bitmap: merge md_bitmap_daemon_work() into bitmap_operations
        md/md-bitmap: merge bitmap_unplug() into bitmap_operations
        md/md-bitmap: merge md_bitmap_unplug_async() into md_bitmap_unplug()
        md/md-bitmap: merge md_bitmap_sync_with_cluster() into bitmap_operations
        md/md-bitmap: merge md_bitmap_cond_end_sync() into bitmap_operations
        md/md-bitmap: merge md_bitmap_close_sync() into bitmap_operations
        md/md-bitmap: merge md_bitmap_end_sync() into bitmap_operations
        md/md-bitmap: remove the parameter 'aborted' for md_bitmap_end_sync()
        md/md-bitmap: merge md_bitmap_start_sync() into bitmap_operations
        md/md-bitmap: merge md_bitmap_endwrite() into bitmap_operations
        md/md-bitmap: merge md_bitmap_startwrite() into bitmap_operations
        ...
      Signed-off-by: Song Liu <song@kernel.org>
    • md: Remove flush handling · b75197e8
      Yu Kuai authored
      For flush requests, md has special flush handling that merges
      concurrent flush requests into a single one. However, the whole
      mechanism is based on a disk level spin_lock 'mddev->lock', and fsync
      can be called quite often in some use cases. As a consequence, taking
      the spin lock in the IO fast path can cause performance degradation.
      
      Fortunately, the block layer already has flush handling that merges
      concurrent flush requests, and it only acquires an hctx level spin
      lock (see details in blk-flush.c).
      
      This patch removes the flush handling in md and relies instead on the
      general block layer flush handling of the underlying disks.
      
      Flush test on a raid10 of 4 nvme disks:
      start 128 threads to call fsync 100000 times on arm64 and measure how
      long it takes.
      
      Test script:
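      #include <fcntl.h>
      #include <pthread.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <sys/time.h>
      #include <unistd.h>
      
      /* Not shown in the original message; values assumed from the
       * description above (128 threads, 100000 fsync calls). */
      #define THREADS     128
      #define FSYNC_COUNT 100000
      
      /* Each thread calls fsync() on the shared fd FSYNC_COUNT times. */
      void* thread_func(void* arg) {
          int fd = *(int*)arg;
          for (int i = 0; i < FSYNC_COUNT; i++) {
              fsync(fd);
          }
          return NULL;
      }
      
      int main(void) {
          int fd = open("/dev/md0", O_RDWR);
          if (fd < 0) {
              perror("open");
              exit(1);
          }
      
          pthread_t threads[THREADS];
          struct timeval start, end;
      
          gettimeofday(&start, NULL);
      
          for (int i = 0; i < THREADS; i++) {
              pthread_create(&threads[i], NULL, thread_func, &fd);
          }
      
          for (int i = 0; i < THREADS; i++) {
              pthread_join(threads[i], NULL);
          }
      
          gettimeofday(&end, NULL);
      
          close(fd);
      
          /* Wall-clock time from first thread start to last thread exit. */
          long long elapsed = (end.tv_sec - start.tv_sec) * 1000000LL + (end.tv_usec - start.tv_usec);
          printf("Elapsed time: %lld microseconds\n", elapsed);
      
          return 0;
      }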
      
      Test result: about 10 times faster:
      Before this patch: 50943374 microseconds
      After this patch:  5096347  microseconds
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Link: https://lore.kernel.org/r/20240827110616.3860190-1-yukuai1@huaweicloud.com
      Signed-off-by: Song Liu <song@kernel.org>
  3. 27 Aug, 2024 35 commits