1. 11 Sep, 2009 9 commits
    • block: improve queue_should_plug() by looking at IO depths · fb1e7538
      Jens Axboe authored
      Instead of just checking whether this device uses block layer
      tagging, we can improve the detection by looking at the maximum
      queue depth it has reached. If that crosses 4, then deem it a
      queuing device.
      
      This is important on high IOPS devices, since plugging hurts
      the performance there (it can be as much as 10-15% of the sys
      time).
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
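A minimal sketch of the depth-based heuristic described in the commit above, not the kernel code itself. All `toy_*` names and the struct layout are hypothetical, and "crosses 4" is assumed here to mean "strictly greater than 4".

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a request queue. */
struct toy_queue {
    bool tagged;             /* device uses block layer tagging */
    unsigned int in_flight;  /* requests currently issued to the device */
    unsigned int max_depth;  /* highest in-flight count observed so far */
};

#define TOY_QUEUING_DEPTH 4  /* threshold named in the commit message */

/* Account for one request being issued and track the peak depth. */
static void toy_issue(struct toy_queue *q)
{
    q->in_flight++;
    if (q->in_flight > q->max_depth)
        q->max_depth = q->in_flight;
}

/* Account for one request completing. */
static void toy_complete(struct toy_queue *q)
{
    if (q->in_flight > 0)
        q->in_flight--;
}

/* A device counts as "queuing" if it is tagged or has ever gone deep. */
static bool toy_is_queuing(const struct toy_queue *q)
{
    return q->tagged || q->max_depth > TOY_QUEUING_DEPTH;
}

/* Plug only for devices that do not look like queuing devices. */
static bool toy_should_plug(const struct toy_queue *q)
{
    return !toy_is_queuing(q);
}
```

Because the peak depth is sticky in this sketch, a device that has once exceeded the threshold stays unplugged even after its queue drains.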
    • bio: first step in sanitizing the bio->bi_rw flag testing · 1f98a13f
      Jens Axboe authored
      Get rid of any functions that test for these bits and make callers
      use bio_rw_flagged() directly. Then it is at least directly apparent
      what variable and flag they check.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: make bio_rw_flagged() return a bool · e7e503ae
      Jens Axboe authored
      Makes for a saner interface, instead of returning the bit position.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
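To illustrate why a bool return is the saner interface, here is a small sketch contrasting a flag test that returns the masked bit value with one that returns a bool. The `toy_*` names and the bit positions are hypothetical and do not reproduce the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical bit positions, for illustration only. */
enum toy_rw_flag { TOY_RW = 0, TOY_RW_AHEAD = 1, TOY_RW_SYNC = 4 };

struct toy_bio {
    unsigned long bi_rw;  /* flag word, one bit per toy_rw_flag */
};

/* Raw style: a set flag yields 1, 2, 16, ... depending on its position. */
static unsigned long toy_flagged_raw(const struct toy_bio *bio,
                                     enum toy_rw_flag flag)
{
    return bio->bi_rw & (1UL << flag);
}

/* Bool style: a set flag always yields true, whatever its position. */
static bool toy_flagged(const struct toy_bio *bio, enum toy_rw_flag flag)
{
    return (bio->bi_rw & (1UL << flag)) != 0;
}
```

With the raw variant, a caller that compares the result against 1 or stores it in a narrow integer can silently get the wrong answer for high bits; the bool variant avoids that class of mistake.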
    • Send uevents for write_protect changes · e3264a4d
      Hannes Reinecke authored
      Whenever a block device changes its read-only attribute,
      notify userspace about it.
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • cfq-iosched: no need to keep track of busy_rt_queues · d58b85e1
      Vivek Goyal authored
      o Get rid of the busy_rt_queues infrastructure. It looks redundant.
      
      o Once an RT queue gets a request, it will preempt any BE or IDLE queue
        immediately. Otherwise the queue is put on the service tree, and the
        scheduler will select it before any BE or IDLE queue anyway. Hence
        there appears to be no need to track how many busy RT queues are
        currently on the service tree.
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • cfq-iosched: drain device queue before switching to a sync queue · 5ad531db
      Jens Axboe authored
      To lessen the impact of async IO on sync IO, let the device drain
      any async IO in progress when switching to a sync cfqq that has idling
      enabled.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • scsi,block: update SCSI to handle mixed merge failures · da6c5c72
      Tejun Heo authored
      Update scsi_io_completion() such that it only fails requests till the
      next error boundary and retries the leftover.  This enables the block
      layer to merge requests with different failfast settings and still
      behave correctly on errors.  Allow merge of requests of different
      failfast settings.
      
      As SCSI is currently the only subsystem which follows failfast status,
      there's no need to worry about other block drivers for now.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: implement mixed merge of different failfast requests · 80a761fd
      Tejun Heo authored
      Failfast has characteristics different from other attributes.  When
      issuing, executing and successfully completing requests, failfast
      doesn't make any difference.  It only affects how a request is handled
      on failure.  Allowing requests with different failfast settings to be
      merged causes normal IOs to fail prematurely, while not allowing it has
      performance penalties, as failfast is used for read aheads which are
      likely to be located near in-flight or to-be-issued normal IOs.
      
      This patch introduces the concept of 'mixed merge'.  A request is a
      mixed merge if it is a merge of segments which require different
      handling on failure.  Currently the only mixable attributes are
      failfast ones (or lack thereof).
      
      When a bio with different failfast settings is added to an existing
      request or requests of different failfast settings are merged, the
      merged request is marked mixed.  Each bio carries failfast settings
      and the request always tracks failfast state of the first bio.  When
      the request fails, blk_rq_err_bytes() can be used to determine how
      many bytes can be safely failed without crossing into an area which
      requires further retries.
      
      This allows request merging regardless of failfast settings while
      keeping the failure handling correct.
      
      This patch only implements mixed merge but doesn't enable it.  The
      next one will update SCSI to make use of mixed merge.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
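The idea behind blk_rq_err_bytes() in the commit above can be sketched as follows. This is a simplified illustration under stated assumptions, not the kernel implementation: the `toy_*` types, flag values and the exact matching rule (a bio belongs to the failable prefix while it carries at least the request's failfast bits) are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical failfast bits; the values are illustrative only. */
#define TOY_FAILFAST_DEV       (1u << 0)
#define TOY_FAILFAST_TRANSPORT (1u << 1)
#define TOY_FAILFAST_DRIVER    (1u << 2)
#define TOY_FAILFAST_MASK \
    (TOY_FAILFAST_DEV | TOY_FAILFAST_TRANSPORT | TOY_FAILFAST_DRIVER)

struct toy_bio {
    unsigned int flags;    /* failfast settings carried by this bio */
    unsigned int size;     /* payload size in bytes */
    struct toy_bio *next;  /* next bio in the request, or NULL */
};

struct toy_request {
    unsigned int flags;    /* tracks the failfast state of the first bio */
    bool mixed;            /* set when bios with different settings merged */
    struct toy_bio *bio;   /* head of the bio chain */
};

/* Total payload of the request. */
static unsigned int toy_rq_bytes(const struct toy_request *rq)
{
    unsigned int bytes = 0;
    for (const struct toy_bio *b = rq->bio; b != NULL; b = b->next)
        bytes += b->size;
    return bytes;
}

/* Bytes that may be failed before reaching a bio that needs a retry. */
static unsigned int toy_rq_err_bytes(const struct toy_request *rq)
{
    unsigned int ff = rq->flags & TOY_FAILFAST_MASK;
    unsigned int bytes = 0;

    if (!rq->mixed)
        return toy_rq_bytes(rq);  /* uniform request: fail all of it */

    for (const struct toy_bio *b = rq->bio; b != NULL; b = b->next) {
        if ((b->flags & ff) != ff)
            break;                /* error boundary: settings differ */
        bytes += b->size;
    }
    return bytes;
}
```

In this sketch a request made of two failfast read-ahead bios followed by a normal bio would fail only the first two on error, leaving the normal bio to be retried.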
    • block: use the same failfast bits for bio and request · a82afdfc
      Tejun Heo authored
      bio and request use the same set of failfast bits.  This patch makes
      the following changes to simplify things.
      
      * enumify BIO_RW* bits and reorder bits such that BIO_RW_FAILFAST_*
        bits coincide with __REQ_FAILFAST_* bits.
      
      * The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV
        but the matching is useless anyway.  init_request_from_bio() is
        responsible for setting FAILFAST bits on FS requests and non-FS
        requests never use BIO_RW_AHEAD.  Drop the code and comment from
        blk_rq_bio_prep().
      
      * Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and
        simplify FAILFAST flags handling in init_request_from_bio().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
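The payoff of making the two bit sets coincide can be shown with a short sketch: once the bio and request failfast bits share positions, copying them is a single mask operation. The `TOY_*` enumerators, their numeric positions and `toy_init_request_flags()` are hypothetical stand-ins, not the kernel's definitions.

```c
/* Hypothetical bio flag bit positions. */
enum toy_bio_rw_bits {
    TOY_BIO_RW = 0,
    TOY_BIO_RW_FAILFAST_DEV = 1,
    TOY_BIO_RW_FAILFAST_TRANSPORT = 2,
    TOY_BIO_RW_FAILFAST_DRIVER = 3
};

/* Hypothetical request flag bit positions, deliberately kept identical. */
enum toy_req_bits {
    TOY_REQ_RW = 0,
    TOY_REQ_FAILFAST_DEV = 1,
    TOY_REQ_FAILFAST_TRANSPORT = 2,
    TOY_REQ_FAILFAST_DRIVER = 3
};

/* OR of all failfast bits, in the spirit of REQ_FAILFAST_MASK. */
#define TOY_REQ_FAILFAST_MASK              \
    ((1u << TOY_REQ_FAILFAST_DEV) |        \
     (1u << TOY_REQ_FAILFAST_TRANSPORT) |  \
     (1u << TOY_REQ_FAILFAST_DRIVER))

/* Fail the build if the two layouts ever drift apart (C11). */
_Static_assert((int)TOY_BIO_RW_FAILFAST_DEV == (int)TOY_REQ_FAILFAST_DEV,
               "DEV bits must coincide");
_Static_assert((int)TOY_BIO_RW_FAILFAST_TRANSPORT ==
               (int)TOY_REQ_FAILFAST_TRANSPORT,
               "TRANSPORT bits must coincide");
_Static_assert((int)TOY_BIO_RW_FAILFAST_DRIVER == (int)TOY_REQ_FAILFAST_DRIVER,
               "DRIVER bits must coincide");

/* With matching positions, the failfast bits carry over with one mask. */
static unsigned int toy_init_request_flags(unsigned int bio_rw)
{
    return bio_rw & TOY_REQ_FAILFAST_MASK;
}
```

The static assertions are an addition for the example; they document the invariant that the simplification relies on.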
  2. 10 Sep, 2009 3 commits
    • md: Fix "strchr" [drivers/md/dm-log-userspace.ko] undefined! · 0d03d59d
      Geert Uytterhoeven authored
      Commit b8313b6d ("dm log: remove incorrect
      field from userspace table output") added a call to strstr() with a
      single-character "needle" string parameter.
      
      Unfortunately some versions of gcc replace such calls to strstr() by calls
      to strchr() behind our back.  This causes linking errors if strchr() is
      defined as an inline function in <asm/string.h> (e.g. on m68k):
      
      | WARNING: "strchr" [drivers/md/dm-log-userspace.ko] undefined!
      
      Avoid this by explicitly calling strchr() instead.
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
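The equivalence that the compiler exploits, and that the fix above makes explicit, can be sketched in user space. The helper names and the searched character are made up for the example and are not taken from the dm code.

```c
#include <stdbool.h>
#include <string.h>

/* Search using a one-character "needle" string; some compilers may
 * rewrite this call into strchr() on their own. */
static bool has_space_strstr(const char *s)
{
    return strstr(s, " ") != NULL;
}

/* The explicit form: same result, and no hidden substitution. */
static bool has_space_strchr(const char *s)
{
    return strchr(s, ' ') != NULL;
}
```

Both helpers return the same answer for any string; calling strchr() directly simply removes the dependency on what the compiler decides to substitute.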
    • Merge branch 'lookup-permissions-cleanup' · 526b6780
      Linus Torvalds authored
      * lookup-permissions-cleanup:
        jffs2/jfs/xfs: switch over to 'check_acl' rather than 'permission()'
        ext[234]: move over to 'check_acl' permission model
        shmfs: use 'check_acl' instead of 'permission'
        Make 'check_acl()' a first-class filesystem op
        Simplify exec_permission_lite(), part 3
        Simplify exec_permission_lite() further
        Simplify exec_permission_lite() logic
        Do not call 'ima_path_check()' for each path component
    • binfmt_elf: fix PT_INTERP bss handling · 752015d1
      Roland McGrath authored
      In fs/binfmt_elf.c, load_elf_interp() calls padzero() for .bss even if
      the PT_LOAD has no PROT_WRITE and no .bss.  This generates EFAULT.
      
      Here is a small test case.  (Yes, there are other, useful PT_INTERP
      which have only .text and no .data/.bss.)
      
      	----- ptinterp.S
      	_start: .globl _start
      		 nop
      		 int3
      	-----
      	$ gcc -m32 -nostartfiles -nostdlib -o ptinterp ptinterp.S
      	$ gcc -m32 -Wl,--dynamic-linker=ptinterp -o hello hello.c
      	$ ./hello
      	Segmentation fault  # during execve() itself
      
      	After applying the patch:
      	$ ./hello
      	Trace trap  # user-mode execution after execve() finishes
      
      If the ELF headers are actually self-inconsistent, then dying is fine.
      But having no PROT_WRITE segment is perfectly normal and correct if
      there is no segment with p_memsz > p_filesz (i.e. bss).  John Reiser
      suggested checking for PROT_WRITE in the bss logic.  I think it makes
      most sense to simply apply the bss logic only when there is bss.
      
      This patch looks less trivial than it is due to some reindentation.
      It just moves the "if (last_bss > elf_bss) {" test up to include the
      partial-page bss logic as well as the more-pages bss logic.
      Reported-by: John Reiser <jreiser@bitwagon.com>
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
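The rule "apply the bss logic only when there is bss" can be illustrated with a small sketch. It is not the load_elf_interp() code: the `toy_*` names and the reduced program header struct are hypothetical, and the only facts taken from the commit message are that bss means p_memsz > p_filesz and that the partial-page zeroing should be skipped otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a PT_LOAD program header. */
struct toy_phdr {
    uint64_t p_vaddr;   /* start address of the segment */
    uint64_t p_filesz;  /* bytes backed by the file */
    uint64_t p_memsz;   /* bytes occupied in memory */
};

/* A segment has bss only if it is larger in memory than in the file. */
static bool toy_has_bss(const struct toy_phdr *ph)
{
    return ph->p_memsz > ph->p_filesz;
}

/* Bytes to zero after the file-backed data in its last page.
 * Returns 0 when there is no bss, so no write is attempted on a
 * segment that may be mapped without PROT_WRITE. */
static uint64_t toy_partial_page_zero_bytes(const struct toy_phdr *ph,
                                            uint64_t page_size)
{
    uint64_t file_end, off;

    if (!toy_has_bss(ph))
        return 0;

    file_end = ph->p_vaddr + ph->p_filesz;
    off = file_end % page_size;
    return off ? page_size - off : 0;
}
```

A text-only interpreter such as the ptinterp test case has p_memsz equal to p_filesz, so in this sketch no zeroing is requested for it at all.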
  3. 09 Sep, 2009 3 commits
    • Linux 2.6.31 · 74fca6a4
      Linus Torvalds authored
    • aoe: allocate unused request_queue for sysfs · 7135a71b
      Ed Cashin authored
      Andy Whitcroft reported an oops in aoe triggered by use of an
      incorrectly initialised request_queue object:
      
        [ 2645.959090] kobject '<NULL>' (ffff880059ca22c0): tried to add
      		an uninitialized object, something is seriously wrong.
        [ 2645.959104] Pid: 6, comm: events/0 Not tainted 2.6.31-5-generic #24-Ubuntu
        [ 2645.959107] Call Trace:
        [ 2645.959139] [<ffffffff8126ca2f>] kobject_add+0x5f/0x70
        [ 2645.959151] [<ffffffff8125b4ab>] blk_register_queue+0x8b/0xf0
        [ 2645.959155] [<ffffffff8126043f>] add_disk+0x8f/0x160
        [ 2645.959161] [<ffffffffa01673c4>] aoeblk_gdalloc+0x164/0x1c0 [aoe]
      
      The request queue of an aoe device is not used but can be allocated in
      code that does not sleep.
      
      Bruno bisected this regression down to
      
        cd43e26f
      
        block: Expose stacked device queues in sysfs
      
      "This seems to generate /sys/block/$device/queue and its contents for
       everyone who is using queues, not just for those queues that have a
       non-NULL queue->request_fn."
      
      Addresses http://bugs.launchpad.net/bugs/410198
      Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13942
      
      Note that embedding a queue inside another object has always been
      an illegal construct, since the queues are reference counted and
      must persist until the last reference is dropped. So aoe was
      always buggy in this respect (Jens).
      Signed-off-by: Ed Cashin <ecashin@coraid.com>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Cc: Bruno Premont <bonbons@linux-vserver.org>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • i915: disable interrupts before tearing down GEM state · e6890f6f
      Linus Torvalds authored
      Reinette Chatre reports a frozen system (with blinking keyboard LEDs)
      when switching from graphics mode to the text console, or when
      suspending (which does the same thing). With netconsole, the oops
      turned out to be
      
      	BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
      	IP: [<ffffffffa03ecaab>] i915_driver_irq_handler+0x26b/0xd20 [i915]
      
      and it's due to the i915_gem.c code doing drm_irq_uninstall() after
      having done i915_gem_idle(). And the i915_gem_idle() path will do
      
        i915_gem_idle() ->
          i915_gem_cleanup_ringbuffer() ->
            i915_gem_cleanup_hws() ->
              dev_priv->hw_status_page = NULL;
      
      but if an i915 interrupt comes in after this stage, it may want to
      access that hw_status_page, and gets the above NULL pointer dereference.
      
      And since the NULL pointer dereference happens from within an interrupt,
      and with the screen still in graphics mode, the common end result is
      simply a silently hung machine.
      
      Fix it by simply uninstalling the irq handler before idling rather than
      after. Fixes
      
          http://bugzilla.kernel.org/show_bug.cgi?id=13819

      Reported-and-tested-by: Reinette Chatre <reinette.chatre@intel.com>
      Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 08 Sep, 2009 9 commits
  5. 07 Sep, 2009 7 commits
  6. 06 Sep, 2009 1 commit
    • gianfar: Fix build. · d9d8e041
      David S. Miller authored
      Reported by Michael Guntsche <mike@it-loops.com>
      
      --------------------
      Commit
      38bddf04 gianfar: gfar_remove needs to call unregister_netdev()
      
      breaks the build of the gianfar driver because "dev" is undefined in
      this function. To quickly test rc9 I changed this to priv->ndev but I do
      not know if this is the correct one.
      --------------------
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 05 Sep, 2009 8 commits