1. 13 Apr, 2010 1 commit
    • Gui Jianfeng's avatar
      io-controller: Add a new interface "weight_device" for IO-Controller · 34d0f179
      Gui Jianfeng authored
      Currently, IO Controller makes use of blkio.weight to assign weight for
      all devices. Here a new user interface "blkio.weight_device" is introduced to
      assign different weights for different devices. blkio.weight becomes the
      default value for devices which are not configured by "blkio.weight_device"
      
      You can use the following format to assigned specific weight for a given
      device:
      #echo "major:minor weight" > blkio.weight_device
      
      major:minor represents device number.
      
      And you can remove weight for a given device as following:
      #echo "major:minor 0" > blkio.weight_device
      
      V1->V2 changes:
      - use user interface "weight_device" instead of "policy" suggested by Vivek
      - rename some struct suggested by Vivek
      - rebase to 2.6-block "for-linus" branch
      - remove an useless list_empty check pointed out by Li Zefan
      - some trivial typo fix
      
      V2->V3 changes:
      - Move policy_*_node() functions up to get rid of forward declarations
      - rename related functions by adding prefix "blkio_"
      Signed-off-by: default avatarGui Jianfeng <guijianfeng@cn.fujitsu.com>
      Acked-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      34d0f179
  2. 09 Apr, 2010 4 commits
    • Divyesh Shah's avatar
      blkio: Add more debug-only per-cgroup stats · 812df48d
      Divyesh Shah authored
      1) group_wait_time - This is the amount of time the cgroup had to wait to get a
        timeslice for one of its queues from when it became busy, i.e., went from 0
        to 1 request queued. This is different from the io_wait_time which is the
        cumulative total of the amount of time spent by each IO in that cgroup waiting
        in the scheduler queue. This stat is a great way to find out any jobs in the
        fleet that are being starved or waiting for longer than what is expected (due
        to an IO controller bug or any other issue).
      2) empty_time - This is the amount of time a cgroup spends w/o any pending
         requests. This stat is useful when a job does not seem to be able to use its
         assigned disk share by helping check if that is happening due to an IO
         controller bug or because the job is not submitting enough IOs.
      3) idle_time - This is the amount of time spent by the IO scheduler idling
         for a given cgroup in anticipation of a better request than the exising ones
         from other queues/cgroups.
      
      All these stats are recorded using start and stop events. When reading these
      stats, we do not add the delta between the current time and the last start time
      if we're between the start and stop events. We avoid doing this to make sure
      that these numbers are always monotonically increasing when read. Since we're
      using sched_clock() which may use the tsc as its source, it may induce some
      inconsistency (due to tsc resync across cpus) if we included the current delta.
      
      Signed-off-by: Divyesh Shah<dpshah@google.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      812df48d
    • Divyesh Shah's avatar
      blkio: Add io_queued and avg_queue_size stats · cdc1184c
      Divyesh Shah authored
      These stats are useful for getting a feel for the queue depth of the cgroup,
      i.e., how filled up its queues are at a given instant and over the existence of
      the cgroup. This ability is useful when debugging problems in the wild as it
      helps understand the application's IO pattern w/o having to read through the
      userspace code (coz its tedious or just not available) or w/o the ability
      to run blktrace (since you may not have root access and/or not want to disturb
      performance).
      
      Signed-off-by: Divyesh Shah<dpshah@google.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      cdc1184c
    • Divyesh Shah's avatar
      blkio: Add io_merged stat · 812d4026
      Divyesh Shah authored
      This includes both the number of bios merged into requests belonging to this
      cgroup as well as the number of requests merged together.
      In the past, we've observed different merging behavior across upstream kernels,
      some by design some actual bugs. This stat helps a lot in debugging such
      problems when applications report decreased throughput with a new kernel
      version.
      
      This needed adding an extra elevator function to capture bios being merged as I
      did not want to pollute elevator code with blkiocg knowledge and hence needed
      the accounting invocation to come from CFQ.
      
      Signed-off-by: Divyesh Shah<dpshah@google.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      812d4026
    • Divyesh Shah's avatar
      blkio: Changes to IO controller additional stats patches · 84c124da
      Divyesh Shah authored
      that include some minor fixes and addresses all comments.
      
      Changelog: (most based on Vivek Goyal's comments)
      o renamed blkiocg_reset_write to blkiocg_reset_stats
      o more clarification in the documentation on io_service_time and io_wait_time
      o Initialize blkg->stats_lock
      o rename io_add_stat to blkio_add_stat and declare it static
      o use bool for direction and sync
      o derive direction and sync info from existing rq methods
      o use 12 for major:minor string length
      o define io_service_time better to cover the NCQ case
      o add a separate reset_stats interface
      o make the indexed stats a 2d array to simplify macro and function pointer code
      o blkio.time now exports in jiffies as before
      o Added stats description in patch description and
        Documentation/cgroup/blkio-controller.txt
      o Prefix all stats functions with blkio and make them static as applicable
      o replace IO_TYPE_MAX with IO_TYPE_TOTAL
      o Moved #define constant to top of blk-cgroup.c
      o Pass dev_t around instead of char *
      o Add note to documentation file about resetting stats
      o use BLK_CGROUP_MODULE in addition to BLK_CGROUP config option in #ifdef
        statements
      o Avoid struct request specific knowledge in blk-cgroup. blk-cgroup.h now has
        rq_direction() and rq_sync() functions which are used by CFQ and when using
        io-controller at a higher level, bio_* functions can be added.
      
      Signed-off-by: Divyesh Shah<dpshah@google.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      84c124da
  3. 06 Apr, 2010 1 commit
    • Matthew Garrett's avatar
      laptop-mode: Make flushes per-device · 31373d09
      Matthew Garrett authored
      One of the features of laptop-mode is that it forces a writeout of dirty
      pages if something else triggers a physical read or write from a device.
      The current implementation flushes pages on all devices, rather than only
      the one that triggered the flush. This patch alters the behaviour so that
      only the recently accessed block device is flushed, preventing other
      disks being spun up for no terribly good reason.
      Signed-off-by: default avatarMatthew Garrett <mjg@redhat.com>
      Signed-off-by: default avatarJens Axboe <jens.axboe@oracle.com>
      31373d09
  4. 02 Apr, 2010 7 commits
  5. 30 Mar, 2010 6 commits
  6. 29 Mar, 2010 21 commits