1. 31 Dec, 2008 6 commits
    • Merge branch 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6 · a4ba2e9e
      Linus Torvalds authored
      * 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
        agp/intel: Fix broken ® symbol in device name.
        agp/intel: add support for G41 chipset
      a4ba2e9e
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6 · 6de71484
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (98 commits)
        sparc: move select of ARCH_SUPPORTS_MSI
        sparc: drop SUN_IO
        sparc: unify sections.h
        sparc: use .data.init_task section for init_thread_union
        sparc: fix array overrun check in of_device_64.c
        sparc: unify module.c
        sparc64: prepare module_64.c for unification
        sparc64: use bit neutral Elf symbols
        sparc: unify module.h
        sparc: introduce CONFIG_BITS
        sparc: fix hardirq.h removal fallout
        sparc64: do not export pus_fs_struct
        sparc: use sparc64 version of scatterlist.h
        sparc: Commonize memcmp assembler.
        sparc: Unify strlen assembler.
        sparc: Add asm/asm.h
        sparc: Kill memcmp_32.S code which has been ifdef'd out for centuries.
        sparc: replace for_each_cpu_mask_nr with for_each_cpu
        sparc: fix sparse warnings in irq_32.c
        sparc: add include guards to kernel.h
        ...
      6de71484
    • Merge branch 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block · 1dff81f2
      Linus Torvalds authored
      * 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block: (43 commits)
        bio: get rid of bio_vec clearing
        bounce: don't rely on a zeroed bio_vec list
        cciss: simplify parameters to deregister_disk function
        cfq-iosched: fix race between exiting queue and exiting task
        loop: Do not call loop_unplug for not configured loop device.
        loop: Flush possible running bios when loop device is released.
        alpha: remove dead BIO_VMERGE_BOUNDARY
        Get rid of CONFIG_LSF
        block: make blk_softirq_init() static
        block: use min_not_zero in blk_queue_stack_limits
        block: add one-hit cache for disk partition lookup
        cfq-iosched: remove limit of dispatch depth of max 4 times quantum
        nbd: tell the block layer that it is not a rotational device
        block: get rid of elevator_t typedef
        aio: make the lookup_ioctx() lockless
        bio: add support for inlining a number of bio_vecs inside the bio
        bio: allow individual slabs in the bio_set
        bio: move the slab pointer inside the bio_set
        bio: only mempool back the largest bio_vec slab cache
        block: don't use plugging on SSD devices
        ...
      1dff81f2
    • Merge branch 'irq-core-for-linus' of... · 179475a3
      Linus Torvalds authored
      Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        x86, sparseirq: clean up Kconfig entry
        x86: turn CONFIG_SPARSE_IRQ off by default
        sparseirq: fix numa_migrate_irq_desc dependency and comments
        sparseirq: add kernel-doc notation for new member in irq_desc, -v2
        locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP
        sparseirq, xen: make sure irq_desc is allocated for interrupts
        sparseirq: fix !SMP building, #2
        x86, sparseirq: move irq_desc according to smp_affinity, v7
        proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ
        sparse irqs: add irqnr.h to the user headers list
        sparse irqs: handle !GENIRQ platforms
        sparseirq: fix !SMP && !PCI_MSI && !HT_IRQ build
        sparseirq: fix Alpha build failure
        sparseirq: fix typo in !CONFIG_IO_APIC case
        x86, MSI: pass irq_cfg and irq_desc
        x86: MSI start irq numbering from nr_irqs_gsi
        x86: use NR_IRQS_LEGACY
        sparse irq_desc[] array: core kernel and x86 changes
        genirq: record IRQ_LEVEL in irq_desc[]
        irq.h: remove padding from irq_desc on 64bits
      179475a3
    • Merge branch 'timers-core-for-linus' of... · bb758e96
      Linus Torvalds authored
      Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        hrtimers: fix warning in kernel/hrtimer.c
        x86: make sure we really have an hpet mapping before using it
        x86: enable HPET on Fujitsu u9200
        linux/timex.h: cleanup for userspace
        posix-timers: simplify de_thread()->exit_itimers() path
        posix-timers: check ->it_signal instead of ->it_pid to validate the timer
        posix-timers: use "struct pid*" instead of "struct task_struct*"
        nohz: suppress needless timer reprogramming
        clocksource, acpi_pm.c: put acpi_pm_read_slow() under CONFIG_PCI
        nohz: no softirq pending warnings for offline cpus
        hrtimer: removing all ur callback modes, fix
        hrtimer: removing all ur callback modes, fix hotplug
        hrtimer: removing all ur callback modes
        x86: correct link to HPET timer specification
        rtc-cmos: export second NVRAM bank
      
      Fixed up conflicts in sound/drivers/pcsp/pcsp.c and sound/core/hrtimer.c
      manually.
      bb758e96
    • Merge branch 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip · 5f34fe1c
      Linus Torvalds authored
      * 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
        stacktrace: provide save_stack_trace_tsk() weak alias
        rcu: provide RCU options on non-preempt architectures too
        printk: fix discarding message when recursion_bug
        futex: clean up futex_(un)lock_pi fault handling
        "Tree RCU": scalable classic RCU implementation
        futex: rename field in futex_q to clarify single waiter semantics
        x86/swiotlb: add default swiotlb_arch_range_needs_mapping
        x86/swiotlb: add default phys<->bus conversion
        x86: unify pci iommu setup and allow swiotlb to compile for 32 bit
        x86: add swiotlb allocation functions
        swiotlb: consolidate swiotlb info message printing
        swiotlb: support bouncing of HighMem pages
        swiotlb: factor out copy to/from device
        swiotlb: add arch hook to force mapping
        swiotlb: allow architectures to override phys<->bus<->phys conversions
        swiotlb: add comment where we handle the overflow of a dma mask on 32 bit
        rcu: fix rcutorture behavior during reboot
        resources: skip sanity check of busy resources
        swiotlb: move some definitions to header
        swiotlb: allow architectures to override swiotlb pool allocation
        ...
      
      Fix up trivial conflicts in
        arch/x86/kernel/Makefile
        arch/x86/mm/init_32.c
        include/linux/hardirq.h
      as per Ingo's suggestions.
      5f34fe1c
  2. 29 Dec, 2008 34 commits
    • bio: get rid of bio_vec clearing · d3f76110
      Jens Axboe authored
      We don't need to clear the memory used for adding bio_vec entries,
      since nobody should be looking at uninitialized members. Any valid
      use should be below bio->bi_vcnt, and the members up to that count
      must be valid since they were added through bio_add_page().
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      d3f76110
    • bounce: don't rely on a zeroed bio_vec list · f735b5ee
      Jens Axboe authored
      __blk_queue_bounce() relies on a zeroed bio_vec list, since it looks
      up arbitrary indexes in the allocated bio. The block layer only
      guarantees that added entries are valid, so clear memory after alloc.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      f735b5ee
    • cciss: simplify parameters to deregister_disk function · a0ea8622
      Stephen M. Cameron authored
      Simplify parameters to deregister_disk function.
      Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      a0ea8622
    • cfq-iosched: fix race between exiting queue and exiting task · 62c1fe9d
      Jens Axboe authored
      Original patch from Nikanth Karthikesan <knikanth@suse.de>
      
      When a queue exits the queue lock is taken and cfq_exit_queue() would free all
      the cic's associated with the queue.
      
      But when a task exits, cfq_exit_io_context() gets cic one by one and then
      locks the associated queue to call __cfq_exit_single_io_context. It looks like
      between getting a cic from the ioc and locking the queue, the queue might have
      exited on another cpu.
      
      Fix this by rechecking the cfq_io_context queue key inside the queue lock
      again, and not calling into __cfq_exit_single_io_context() if somebody
      beat us to it.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      62c1fe9d
    • loop: Do not call loop_unplug for not configured loop device. · 8ae30b89
      Milan Broz authored
      The loop_unplug() function expects that the mapping is set
      and that lo->lo_backing_file is not NULL.
      
      Unfortunately loop_set_fd() sets the request queue unplug function,
      but loop_clr_fd() doesn't clear it.
      
      The loop device allows a non-configured loop device to be opened in
      some situations. If unplug is called on the request queue, the loop
      module oopses because lo_backing_file is missing.
      
      Simple reproducer:
      	losetup /dev/loop0 /xxx
      	losetup -d /dev/loop0
      	dmsetup create x --table "0 1 linear /dev/loop0 0"
      
       EIP is at loop_unplug+0x1d/0x3b
       ...
        Call Trace:
         blk_unplug+0x57/0x5e
         dm_table_unplug_all+0x34/0x77 [dm_mod]
         destroy_inode+0x27/0x38
         generic_delete_inode+0xd5/0xd9
         iput+0x4b/0x4e
         dm_resume+0xca/0xfe [dm_mod]
         dev_suspend+0x143/0x165 [dm_mod]
         dm_ctl_ioctl+0x18e/0x1cf [dm_mod]
         dev_suspend+0x0/0x165 [dm_mod]
         dm_ctl_ioctl+0x0/0x1cf [dm_mod]
         vfs_ioctl+0x22/0x69
         do_vfs_ioctl+0x39d/0x3c7
         trace_hardirqs_on+0xb/0xd
         remove_vma+0x50/0x56
         do_munmap+0x21c/0x237
         sys_ioctl+0x2c/0x45
         sysenter_do_call+0x12/0x31
      
      Several reports here
      http://www.kerneloops.org/search.php?search=loop_unplug
      
      Fix it by simply clearing the unplug function when the
      backing file is removed.
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      8ae30b89
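The loop_unplug fix above amounts to clearing a callback at the same point where the state it depends on goes away. The following is a minimal user-space sketch of that idea, not the kernel code: `struct loop_dev` and every `sketch_*` function are hypothetical names introduced only for illustration, and the NULL check on the callback is an assumption about how the caller behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a loop device and its request queue. */
struct loop_dev {
    const char *backing_file;               /* NULL when the device is not configured */
    void (*unplug_fn)(struct loop_dev *lo); /* queue unplug callback, may be NULL */
    int unplug_calls;                       /* counter kept only for illustration */
};

/* The callback assumes a backing file exists, as loop_unplug() does in the commit. */
static void sketch_unplug(struct loop_dev *lo)
{
    assert(lo->backing_file != NULL);
    lo->unplug_calls++;
}

/* Configure the device: set the backing file and install the callback. */
static void sketch_set_fd(struct loop_dev *lo, const char *file)
{
    lo->backing_file = file;
    lo->unplug_fn = sketch_unplug;
}

/* Tear down: clear the callback together with the backing file (the fix). */
static void sketch_clr_fd(struct loop_dev *lo)
{
    lo->unplug_fn = NULL;
    lo->backing_file = NULL;
}

/* Caller side (assumed): only invoke the callback if one is installed. */
static void sketch_queue_unplug(struct loop_dev *lo)
{
    if (lo->unplug_fn)
        lo->unplug_fn(lo);
}
```

With the callback cleared in `sketch_clr_fd()`, unplugging a released device becomes a no-op instead of dereferencing a missing backing file.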
    • loop: Flush possible running bios when loop device is released. · 14f27939
      Milan Broz authored
      When there are still queued bios and the reference count
      drops to zero, the loop device must flush all queued bios.
      
      Otherwise the caller can close the device while some bios
      are still running, and a later endio() call oopses when it
      uses an unallocated mempool.
      
      This happens for example when running dm-crypt over loop,
      here is typical oops backtrace:
      
       Oops: 0000 [#1] PREEMPT SMP
       EIP is at mempool_free+0x12/0x6b
      ...
       crypt_dec_pending+0x50/0x54 [dm_crypt]
       crypt_endio+0x9f/0xa7 [dm_crypt]
       crypt_endio+0x0/0xa7 [dm_crypt]
       bio_endio+0x2b/0x2e
       loop_thread+0x37a/0x3b1
       do_lo_send_aops+0x0/0x165
       autoremove_wake_function+0x0/0x33
       loop_thread+0x0/0x3b1
       kthread+0x3b/0x61
       kthread+0x0/0x61
       kernel_thread_helper+0x7/0x10
      
      (But crash is reproducible with different dm targets
      running over loop device too.)
      
      This patch fixes it by flushing the bios in the release call,
      reusing the flush mechanism used for switching the backing store.
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      14f27939
    • alpha: remove dead BIO_VMERGE_BOUNDARY · 10e5b644
      FUJITA Tomonori authored
      The block layer dropped the virtual merge feature
      (b8b3e16c). BIO_VMERGE_BOUNDARY
      definition is meaningless now.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      10e5b644
    • Get rid of CONFIG_LSF · b3a6ffe1
      Jens Axboe authored
      We have two separate config entries for large devices/files. One
      is CONFIG_LBD that guards just the devices, the other is CONFIG_LSF
      that handles large files. This doesn't make a lot of sense, you typically
      want both or none. So get rid of CONFIG_LSF and change CONFIG_LBD wording
      to indicate that it covers both.
      Acked-by: Jean Delvare <khali@linux-fr.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      b3a6ffe1
    • block: make blk_softirq_init() static · 3c18ce71
      Roel Kluin authored
      Sparse asked whether these could be static.
      Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      3c18ce71
    • block: use min_not_zero in blk_queue_stack_limits · 18af8b2c
      FUJITA Tomonori authored
      Zero is invalid for max_phys_segments, max_hw_segments, and
      max_segment_size. It's better to use min_not_zero instead of
      min. min() works though (because the commit 0e435ac2 makes sure that
      these values are set to the default values, non zero, if a queue is
      initialized properly).
      
      With this patch, blk_queue_stack_limits does almost the same thing
      that dm's combine_restrictions_low() does. I think that it's easy to
      remove dm's combine_restrictions_low.
      Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      18af8b2c
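To illustrate why a zero-aware minimum matters when stacking limits, here is a small user-space sketch. It is not the kernel macro: `min_not_zero_u()`, `struct limits`, and `stack_limits()` are hypothetical names, and the struct carries only the three fields named in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for min_not_zero(): return the smaller of the two
 * values, treating 0 as "not set" rather than as a candidate minimum. */
static unsigned int min_not_zero_u(unsigned int a, unsigned int b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a < b ? a : b;
}

/* Hypothetical subset of queue limits, using the field names from the commit. */
struct limits {
    unsigned int max_phys_segments;
    unsigned int max_hw_segments;
    unsigned int max_segment_size;
};

/* Fold the bottom device's limits into the top (stacked) device's limits. */
static void stack_limits(struct limits *top, const struct limits *bottom)
{
    assert(top != NULL && bottom != NULL);
    top->max_phys_segments =
        min_not_zero_u(top->max_phys_segments, bottom->max_phys_segments);
    top->max_hw_segments =
        min_not_zero_u(top->max_hw_segments, bottom->max_hw_segments);
    top->max_segment_size =
        min_not_zero_u(top->max_segment_size, bottom->max_segment_size);
}
```

A plain `min()` would let an unset (zero) limit on either side win and produce an invalid zero limit on the stacked device; the zero-aware version keeps whichever side actually holds a value.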
    • block: add one-hit cache for disk partition lookup · a6f23657
      Jens Axboe authored
      disk_map_sector_rcu() returns a partition from a sector offset,
      which we use for IO statistics on a per-partition basis. The
      lookup itself is an O(N) list lookup, where N is the number of
      partitions. This actually hurts performance quite a bit, even
      on the lower end partitions. On higher numbered partitions,
      it can get pretty bad.
      
      Solve this by adding a one-hit cache for partition lookup.
      This makes the lookup O(1) for the case where we do most IO to
      one partition. Even for mixed partition workloads, amortized cost
      is pretty close to O(1) since the natural IO batching makes the
      one-hit cache last for lots of IOs.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      a6f23657
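The one-hit cache described above can be sketched in user space as follows. This is a simplified illustration under stated assumptions: `struct part`, `struct disk`, and `map_sector()` are hypothetical names, the partition table is a plain array rather than an RCU-protected structure, and a miss returns NULL instead of whatever the kernel falls back to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partition: covers sectors [start, start + nr_sects). */
struct part {
    unsigned long start;
    unsigned long nr_sects;
};

/* Hypothetical disk with a one-entry lookup cache. */
struct disk {
    struct part *parts;
    size_t nr_parts;
    struct part *last_lookup;   /* the one-hit cache: last partition found */
    unsigned long slow_lookups; /* counts full scans, for illustration only */
};

static int sector_in_part(const struct part *p, unsigned long sector)
{
    return sector >= p->start && sector < p->start + p->nr_sects;
}

/* Map a sector to its partition, trying the cached entry before the O(N) scan. */
static struct part *map_sector(struct disk *d, unsigned long sector)
{
    size_t i;

    assert(d != NULL);
    if (d->last_lookup && sector_in_part(d->last_lookup, sector))
        return d->last_lookup;          /* O(1) fast path */

    d->slow_lookups++;
    for (i = 0; i < d->nr_parts; i++) {
        if (sector_in_part(&d->parts[i], sector)) {
            d->last_lookup = &d->parts[i]; /* remember for the next lookup */
            return &d->parts[i];
        }
    }
    return NULL;                        /* simplification: no partition matched */
}
```

As long as consecutive IOs land in the same partition, only the first lookup pays for the scan, which is the batching effect the commit message relies on.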
    • cfq-iosched: remove limit of dispatch depth of max 4 times quantum · 30e0dc28
      Jens Axboe authored
      This basically limits the hardware queue depth to 4*quantum at any
      point in time, which is 16 with the default settings. As CFQ uses
      other means to shrink the hardware queue when necessary in the first
      place, there's really no need for this extra heuristic. Additionally,
      it ends up hurting performance in some cases.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      30e0dc28
    • nbd: tell the block layer that it is not a rotational device · 31dcfab0
      Jens Axboe authored
      Then we can get rid of that manual elevator type fiddling.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      31dcfab0
    • block: get rid of elevator_t typedef · b374d18a
      Jens Axboe authored
      Just use struct elevator_queue everywhere instead.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      b374d18a
    • aio: make the lookup_ioctx() lockless · abf137dd
      Jens Axboe authored
      The mm->ioctx_list is currently protected by a reader-writer lock,
      so we always grab that lock on the read side for doing ioctx
      lookups. As the workload is extremely reader biased, turn this into
      an rcu hlist so we can make lookup_ioctx() lockless. Get rid of
      the rwlock and use a spinlock for providing update side exclusion.
      
      There's usually only 1 entry on this list, so it doesn't make sense
      to look into fancier data structures.
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      abf137dd
    • bio: add support for inlining a number of bio_vecs inside the bio · 392ddc32
      Jens Axboe authored
      When we go and allocate a bio for IO, we actually do two allocations.
      One for the bio itself, and one for the bi_io_vec that holds the
      actual pages we are interested in.
      
      This feature inlines a definable amount of io vecs inside the bio
      itself, so we eliminate the bio_vec array allocation for IO's up
      to a certain size. It defaults to 4 vecs, which is typically 16k
      of IO.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      392ddc32
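The allocation saving described above can be shown with a small user-space sketch. It is a simplification, not the kernel's layout: `struct io`, `struct vec`, and the `io_*` functions are hypothetical names, and the inline array is a fixed-size member here, whereas the commit speaks of a definable amount.

```c
#include <assert.h>
#include <stdlib.h>

#define INLINE_VECS 4 /* the default count stated in the commit message */

/* Hypothetical stand-in for a bio_vec. */
struct vec {
    void *page;
    unsigned int len;
    unsigned int offset;
};

/* Hypothetical stand-in for a bio with a few vecs embedded in it. */
struct io {
    unsigned short max_vecs;
    unsigned short vcnt;
    struct vec *io_vec;                  /* points at inline_vecs or a heap array */
    struct vec inline_vecs[INLINE_VECS]; /* avoids a second allocation for small IO */
};

static int io_uses_inline(const struct io *io)
{
    assert(io != NULL);
    return io->io_vec == io->inline_vecs;
}

/* One allocation for small requests, two for large ones. */
static struct io *io_alloc(unsigned short nr_vecs)
{
    struct io *io = calloc(1, sizeof(*io));

    if (!io)
        return NULL;
    if (nr_vecs <= INLINE_VECS) {
        io->io_vec = io->inline_vecs;
        io->max_vecs = INLINE_VECS;
    } else {
        io->io_vec = calloc(nr_vecs, sizeof(struct vec));
        if (!io->io_vec) {
            free(io);
            return NULL;
        }
        io->max_vecs = nr_vecs;
    }
    return io;
}

static void io_free(struct io *io)
{
    if (!io)
        return;
    if (!io_uses_inline(io))
        free(io->io_vec);
    free(io);
}
```

With 4 KiB pages, four inline vecs cover the 16k of IO mentioned in the commit message without touching the separate vec allocator.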
    • bio: allow individual slabs in the bio_set · bb799ca0
      Jens Axboe authored
      Instead of having a global bio slab cache, add a reference to one
      in each bio_set that is created. This allows for personalized slabs
      in each bio_set, so that they can have bios of different sizes.
      
      This means we can personalize the bios we return. File systems may
      want to embed the bio inside another structure, to avoid allocating
      more items (and stuffing them in ->bi_private) after they get a bio.
      Or we may want to embed a number of bio_vecs directly at the end
      of a bio, to avoid doing two allocations to return a bio. This is now
      possible.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      bb799ca0
    • bio: move the slab pointer inside the bio_set · 1b434498
      Jens Axboe authored
      In preparation for adding differently sized bios.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      1b434498
    • bio: only mempool back the largest bio_vec slab cache · 7ff9345f
      Jens Axboe authored
      We only very rarely need the mempool backing, so it makes sense to
      get rid of all but one of the mempools in a bio_set. So keep the
      largest bio_vec count mempool so we can always honor the largest
      allocation, and "upgrade" callers that fail.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      7ff9345f
    • block: don't use plugging on SSD devices · a31a9738
      Jens Axboe authored
      We just want to hand the first bits of IO to the device as fast
      as possible. Gains a few percent on the IOPS rate.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      a31a9738
    • block: fix empty barrier on write-through w/ ordered tag · a185eb4b
      Tejun Heo authored
      Empty barrier on write-through (or no cache) w/ ordered tag has no
      command to execute and without any command to execute ordered tag is
      never issued to the device and the ordering is never achieved.  Force
      draining for such cases.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      a185eb4b
    • block: simplify empty barrier implementation · 58eea927
      Tejun Heo authored
      Empty barrier required special handling in __elv_next_request() to
      complete it without letting the low level driver see it.
      
      With previous changes, barrier code is now flexible enough to skip the
      BAR step using the same barrier sequence selection mechanism.  Drop
      the special handling and mask off q->ordered from start_ordered().
      
      Remove blk_empty_barrier() test which now has no user.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      58eea927
    • block: make barrier completion more robust · 8f11b3e9
      Tejun Heo authored
      Barrier completion had the following assumptions.
      
      * start_ordered() couldn't finish the whole sequence properly.  If all
        actions are to be skipped, q->ordseq is set correctly but the actual
        completion was never triggered thus hanging the barrier request.
      
      * Drain completion in elv_complete_request() assumed that there's
        always at least one request in the queue when drain completes.
      
      Both assumptions are true but these assumptions need to be removed to
      improve empty barrier implementation.  This patch makes the following
      changes.
      
      * Make start_ordered() use blk_ordered_complete_seq() to mark skipped
        steps complete and notify __elv_next_request() that it should fetch
        the next request if the whole barrier has completed inside
        start_ordered().
      
      * Make drain completion path in elv_complete_request() check whether
        the queue is empty.  Empty queue also indicates drain completion.
      
      * While at it, convert 0/1 return from blk_do_ordered() to false/true.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      8f11b3e9
    • block: make every barrier action optional · f671620e
      Tejun Heo authored
      In all barrier sequences, the barrier write itself was always assumed
      to be issued and thus didn't have a corresponding control flag.  This
      patch adds QUEUE_ORDERED_DO_BAR and unifies action mask handling in
      start_ordered() such that any barrier action can be skipped.
      
      This patch doesn't introduce any visible behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      f671620e
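As a rough illustration of an action mask in which every step, including the barrier write itself, has its own flag, consider the sketch below. It does not reproduce the kernel's constants or start_ordered(): the `ORD_*` names, their numeric values, and `plan_sequence()` are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values: ordering type in the low bits, actions above. */
enum {
    ORD_BY_DRAIN     = 0x01,
    ORD_BY_TAG       = 0x02,
    ORD_DO_PREFLUSH  = 0x10,
    ORD_DO_BAR       = 0x20, /* the barrier write is now optional too */
    ORD_DO_POSTFLUSH = 0x40
};

/* Walk the sequence in order and record the steps whose flag is set.
 * Returns the number of steps written to out[] (at most 3). */
static int plan_sequence(unsigned int mode, unsigned int out[3])
{
    static const unsigned int order[3] = {
        ORD_DO_PREFLUSH, ORD_DO_BAR, ORD_DO_POSTFLUSH
    };
    int i, n = 0;

    assert(out != NULL);
    for (i = 0; i < 3; i++) {
        if (mode & order[i])
            out[n++] = order[i]; /* step is issued */
        /* otherwise the step is skipped and treated as already complete */
    }
    return n;
}
```

Because the barrier write is just another bit, masking it off, for instance `mode & ~ORD_DO_BAR`, goes through the same code path as skipping a flush.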
    • block: remove duplicate or unused barrier/discard error paths · a7384677
      Tejun Heo authored
      * Because barrier mode can be changed dynamically, whether barrier is
        supported or not can be determined only when actually issuing the
        barrier and there is no point in checking it earlier.  Drop barrier
        support check in generic_make_request() and __make_request(), and
        update comment around the support check in blk_do_ordered().
      
      * There is no reason to check discard support in both
        generic_make_request() and __make_request().  Drop the check in
        __make_request().  While at it, move error action block to the end
        of the function and add unlikely() to q existence test.
      
      * Barrier request, be it empty or not, is never passed to low level
        driver and thus it's meaningless to try to copy back req->sector to
        bio->bi_sector on error.  In addition, the notion of failed sector
        doesn't make any sense for empty barrier to begin with.  Drop the
        code block from __end_that_request_first().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      a7384677
    • block: reorganize QUEUE_ORDERED_* constants · 313e4299
      Tejun Heo authored
      Separate out ordering type (drain,) and action masks (preflush,
      postflush, fua) from visible ordering mode selectors
      (QUEUE_ORDERED_*).  Ordering types are now named QUEUE_ORDERED_BY_*
      while action masks are named QUEUE_ORDERED_DO_*.
      
      This change is necessary to add QUEUE_ORDERED_DO_BAR and make it
      optional to improve empty barrier implementation.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      313e4299
    • block: reorder struct bio to remove padding on 64bit · ba744d5e
      Richard Kennedy authored
      Remove 8 bytes of padding from struct bio which also removes 16 bytes from
      struct bio_pair to make it 248 bytes.  bio_pair then fits into one fewer
      cache lines & into a smaller slab.
      Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      ba744d5e
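The effect of field reordering on padding can be reproduced with two toy structs. These are not struct bio or struct bio_pair: `struct padded`, `struct reordered`, and `bytes_saved()` are hypothetical, and the exact sizes depend on the ABI (the 8-byte saving assumes 8-byte pointers and 4-byte `unsigned int`).

```c
#include <assert.h>
#include <stddef.h>

/* Each 4-byte field is followed by an 8-byte-aligned pointer, so on a typical
 * 64-bit ABI the compiler inserts 4 bytes of padding after b and after d. */
struct padded {
    void *a;
    unsigned int b;
    void *c;
    unsigned int d;
    void *e;
};

/* Same fields, with the two 4-byte members placed next to each other so they
 * share one 8-byte slot and no padding is needed. */
struct reordered {
    void *a;
    void *c;
    void *e;
    unsigned int b;
    unsigned int d;
};

/* Bytes saved by the reordering on the current ABI. */
static size_t bytes_saved(void)
{
    assert(sizeof(struct reordered) <= sizeof(struct padded));
    return sizeof(struct padded) - sizeof(struct reordered);
}
```

On ABIs where pointers are 4 bytes the two layouts are the same size, which matches the commit's "on 64bit" qualifier.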
    • block: use cancel_work_sync() instead of kblockd_flush_work() · 64d01dc9
      Cheng Renquan authored
      After many improvements on kblockd_flush_work, it is now identical to
      cancel_work_sync, so a direct call to cancel_work_sync is suggested.
      
      The only difference is that cancel_work_sync is a GPL symbol,
      so no non-GPL modules anymore.
      Signed-off-by: Cheng Renquan <crquan@gmail.com>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      64d01dc9
    • block: Supress Buffer I/O errors when SCSI REQ_QUIET flag set · 08bafc03
      Keith Mannthey authored
      Allow the scsi request REQ_QUIET flag to be propagated to the buffer
      file system layer. The basic idea is to pass the flag from the scsi
      request to the bio (block IO) and then to the buffer layer.  The buffer
      layer can then suppress needless printks.
      
      This patch declutters the kernel log by removing the 40-50 (per lun)
      buffer io error messages seen during a boot in my multipath setup. There
      is a good chance any real errors will be missed in the "noise" in the
      logs without this patch.
      
      During boot I see blocks of messages like
      "
      __ratelimit: 211 callbacks suppressed
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242847
      Buffer I/O error on device sdm, logical block 1
      Buffer I/O error on device sdm, logical block 5242878
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242879
      Buffer I/O error on device sdm, logical block 5242872
      "
      in my logs.
      
      My disk environment is multipath fiber channel using the SCSI_DH_RDAC
      code and multipathd.  This topology includes an "active" and "ghost"
      path for each lun. IOs to the "ghost" path will never complete and the
      SCSI layer, via the scsi device handler rdac code, quickly returns the IOs
      to these paths and sets the REQ_QUIET scsi flag to suppress the scsi
      layer messages.
      
      I want to extend the QUIET behavior to include the buffer file
      system layer to deal with these errors as well. I have been running this
      patch for a while now on several boxes without issue.  A few runs of
      bonnie++ show no noticeable difference in performance in my setup.
      
      Thanks to John Stultz for the quiet_error finalization.
      Submitted-by: Keith Mannthey <kmannth@us.ibm.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      08bafc03
    • block: don't take lock on changing ra_pages · 7c239517
      Wu Fengguang authored
      There's no need to take queue_lock or kernel_lock when modifying
      bdi->ra_pages. So remove them. Also remove out of date comment for
      queue_max_sectors_store().
      Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      7c239517
    • Documentation: remove reference to ll_rw_blk.c and moved drivers/block/elevator.c · 42364690
      Nikanth Karthikesan authored
      The drivers/block/ll_rw_blk.c has been split and organized in the block/
      directory, and also drivers/block/elevator.c has been moved to the block/
      directory. Update Documentation/block/biodoc.txt accordingly.
      Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      42364690
    • block/blk-tag.c: cleanup kernel-doc · c6a06f70
      Qinghuang Feng authored
      There is no argument named @tags in blk_init_tags,
      remove its comment.
      Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      c6a06f70
    • cciss: switch to using hlist for command list management · 8a3173de
      Jens Axboe authored
      This both cleans up the code and also helps detect the spurious case
      of an attempt to remove a command from a queue it doesn't belong
      to.
      Acked-by: Mike Miller <mike.miller@hp.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      8a3173de
    • Do not free io context when taking recursive faults in do_exit · 7c0990c7
      Nikanth Karthikesan authored
      When taking recursive faults in do_exit, if the io_context is not null,
      exit_io_context() is being called. But it might decrement the refcount
      more than once. It is better to leave this task alone.
      Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
      7c0990c7