Commits · fb1e75389bd06fd5987e9cda1b4e0305c782f854 · nexedi / linux

11 Sep, 2009 9 commits

block: improve queue_should_plug() by looking at IO depths · fb1e7538

Jens Axboe authored Jul 30, 2009

Instead of just checking whether this device uses block layer
tagging, we can improve the detection by looking at the maximum
queue depth it has reached. If that crosses 4, then deem it a
queuing device.

This is important on high IOPS devices, since plugging hurts
the performance there (it can be as much as 10-15% of the sys
time).
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

fb1e7538

bio: first step in sanitizing the bio->bi_rw flag testing · 1f98a13f

Jens Axboe authored Sep 11, 2009

Get rid of any functions that test for these bits and make callers
use bio_rw_flagged() directly. Then it is at least directly apparent
what variable and flag they check.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

1f98a13f

block: make bio_rw_flagged() return a bool · e7e503ae

Jens Axboe authored Jul 28, 2009

Makes for a saner interface, instead of returning the bit position.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

e7e503ae

Send uevents for write_protect changes · e3264a4d

Hannes Reinecke authored Jul 28, 2009

Whenever a block device changes it's read-only attribute
notify the userspace about it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

e3264a4d

cfq-iosched: no need to keep track of busy_rt_queues · d58b85e1

Vivek Goyal authored Jul 10, 2009

o Get rid of busy_rt_queues infrastructure. Looks like it is redundant.

o Once an RT queue gets request it will preempt any of the BE or IDLE queues
immediately. Otherwise this queue will be put on service tree and scheduler
will anyway select this queue before any of the BE or IDLE queue. Hence
looks like there is no need to keep track of how many busy RT queues are
currently on service tree.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

d58b85e1

cfq-iosched: drain device queue before switching to a sync queue · 5ad531db

Jens Axboe authored Jul 03, 2009

To lessen the impact of async IO on sync IO, let the device drain of
any async IO in progress when switching to a sync cfqq that has idling
enabled.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

5ad531db

scsi,block: update SCSI to handle mixed merge failures · da6c5c72

Tejun Heo authored Sep 11, 2009

Update scsi_io_completion() such that it only fails requests till the
next error boundary and retry the leftover.  This enables block layer
to merge requests with different failfast settings and still behave
correctly on errors.  Allow merge of requests of different failfast
settings.

As SCSI is currently the only subsystem which follows failfast status,
there's no need to worry about other block drivers for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

da6c5c72

block: implement mixed merge of different failfast requests · 80a761fd

Tejun Heo authored Jul 03, 2009

Failfast has characteristics from other attributes.  When issuing,
executing and successuflly completing requests, failfast doesn't make
any difference.  It only affects how a request is handled on failure.
Allowing requests with different failfast settings to be merged cause
normal IOs to fail prematurely while not allowing has performance
penalties as failfast is used for read aheads which are likely to be
located near in-flight or to-be-issued normal IOs.

This patch introduces the concept of 'mixed merge'.  A request is a
mixed merge if it is merge of segments which require different
handling on failure.  Currently the only mixable attributes are
failfast ones (or lack thereof).

When a bio with different failfast settings is added to an existing
request or requests of different failfast settings are merged, the
merged request is marked mixed.  Each bio carries failfast settings
and the request always tracks failfast state of the first bio.  When
the request fails, blk_rq_err_bytes() can be used to determine how
many bytes can be safely failed without crossing into an area which
requires further retrials.

This allows request merging regardless of failfast settings while
keeping the failure handling correct.

This patch only implements mixed merge but doesn't enable it.  The
next one will update SCSI to make use of mixed merge.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Niel Lambrechts <niel.lambrechts@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

80a761fd

block: use the same failfast bits for bio and request · a82afdfc

Tejun Heo authored Jul 03, 2009

bio and request use the same set of failfast bits.  This patch makes
the following changes to simplify things.

* enumify BIO_RW* bits and reorder bits such that BIOS_RW_FAILFAST_*
  bits coincide with __REQ_FAILFAST_* bits.

* The above pushes BIO_RW_AHEAD out of sync with __REQ_FAILFAST_DEV
  but the matching is useless anyway.  init_request_from_bio() is
  responsible for setting FAILFAST bits on FS requests and non-FS
  requests never use BIO_RW_AHEAD.  Drop the code and comment from
  blk_rq_bio_prep().

* Define REQ_FAILFAST_MASK which is OR of all FAILFAST bits and
  simplify FAILFAST flags handling in init_request_from_bio().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

a82afdfc

10 Sep, 2009 3 commits

md: Fix "strchr" [drivers/md/dm-log-userspace.ko] undefined! · 0d03d59d

Geert Uytterhoeven authored Sep 10, 2009

Commit b8313b6d ("dm log: remove incorrect
field from userspace table output") added a call to strstr() with a
single-character "needle" string parameter.

Unfortunately some versions of gcc replace such calls to strstr() by calls
to strchr() behind our back.  This causes linking errors if strchr() is
defined as an inline function in <asm/string.h> (e.g. on m68k):

| WARNING: "strchr" [drivers/md/dm-log-userspace.ko] undefined!

Avoid this by explicitly calling strchr() instead.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0d03d59d

Merge branch 'lookup-permissions-cleanup' · 526b6780

Linus Torvalds authored Sep 09, 2009

* lookup-permissions-cleanup:
  jffs2/jfs/xfs: switch over to 'check_acl' rather than 'permission()'
  ext[234]: move over to 'check_acl' permission model
  shmfs: use 'check_acl' instead of 'permission'
  Make 'check_acl()' a first-class filesystem op
  Simplify exec_permission_lite(), part 3
  Simplify exec_permission_lite() further
  Simplify exec_permission_lite() logic
  Do not call 'ima_path_check()' for each path component

526b6780

binfmt_elf: fix PT_INTERP bss handling · 752015d1

Roland McGrath authored Sep 08, 2009

In fs/binfmt_elf.c, load_elf_interp() calls padzero() for .bss even if
the PT_LOAD has no PROT_WRITE and no .bss.  This generates EFAULT.

Here is a small test case.  (Yes, there are other, useful PT_INTERP
which have only .text and no .data/.bss.)

	----- ptinterp.S
	_start: .globl _start
		 nop
		 int3
	-----
	$ gcc -m32 -nostartfiles -nostdlib -o ptinterp ptinterp.S
	$ gcc -m32 -Wl,--dynamic-linker=ptinterp -o hello hello.c
	$ ./hello
	Segmentation fault  # during execve() itself

	After applying the patch:
	$ ./hello
	Trace trap  # user-mode execution after execve() finishes

If the ELF headers are actually self-inconsistent, then dying is fine.
But having no PROT_WRITE segment is perfectly normal and correct if
there is no segment with p_memsz > p_filesz (i.e. bss).  John Reiser
suggested checking for PROT_WRITE in the bss logic.  I think it makes
most sense to simply apply the bss logic only when there is bss.

This patch looks less trivial than it is due to some reindentation.
It just moves the "if (last_bss > elf_bss) {" test up to include the
partial-page bss logic as well as the more-pages bss logic.
Reported-by: John Reiser <jreiser@bitwagon.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

752015d1

09 Sep, 2009 3 commits

Linux 2.6.31 · 74fca6a4
Linus Torvalds authored Sep 09, 2009

74fca6a4

aoe: allocate unused request_queue for sysfs · 7135a71b

Ed Cashin authored Sep 09, 2009

Andy Whitcroft reported an oops in aoe triggered by use of an
incorrectly initialised request_queue object:

  [ 2645.959090] kobject '<NULL>' (ffff880059ca22c0): tried to add
		an uninitialized object, something is seriously wrong.
  [ 2645.959104] Pid: 6, comm: events/0 Not tainted 2.6.31-5-generic #24-Ubuntu
  [ 2645.959107] Call Trace:
  [ 2645.959139] [<ffffffff8126ca2f>] kobject_add+0x5f/0x70
  [ 2645.959151] [<ffffffff8125b4ab>] blk_register_queue+0x8b/0xf0
  [ 2645.959155] [<ffffffff8126043f>] add_disk+0x8f/0x160
  [ 2645.959161] [<ffffffffa01673c4>] aoeblk_gdalloc+0x164/0x1c0 [aoe]

The request queue of an aoe device is not used but can be allocated in
code that does not sleep.

Bruno bisected this regression down to

  cd43e26f

  block: Expose stacked device queues in sysfs

"This seems to generate /sys/block/$device/queue and its contents for
 everyone who is using queues, not just for those queues that have a
 non-NULL queue->request_fn."

Addresses http://bugs.launchpad.net/bugs/410198
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13942

Note that embedding a queue inside another object has always been
an illegal construct, since the queues are reference counted and
must persist until the last reference is dropped. So aoe was
always buggy in this respect (Jens).
Signed-off-by: Ed Cashin <ecashin@coraid.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Bruno Premont <bonbons@linux-vserver.org>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

7135a71b

i915: disable interrupts before tearing down GEM state · e6890f6f

Linus Torvalds authored Sep 08, 2009

Reinette Chatre reports a frozen system (with blinking keyboard LEDs)
when switching from graphics mode to the text console, or when
suspending (which does the same thing). With netconsole, the oops
turned out to be

	BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
	IP: [<ffffffffa03ecaab>] i915_driver_irq_handler+0x26b/0xd20 [i915]

and it's due to the i915_gem.c code doing drm_irq_uninstall() after
having done i915_gem_idle(). And the i915_gem_idle() path will do

  i915_gem_idle() ->
    i915_gem_cleanup_ringbuffer() ->
      i915_gem_cleanup_hws() ->
        dev_priv->hw_status_page = NULL;

but if an i915 interrupt comes in after this stage, it may want to
access that hw_status_page, and gets the above NULL pointer dereference.

And since the NULL pointer dereference happens from within an interrupt,
and with the screen still in graphics mode, the common end result is
simply a silently hung machine.

Fix it by simply uninstalling the irq handler before idling rather than
after. Fixes

    http://bugzilla.kernel.org/show_bug.cgi?id=13819Reported-and-tested-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e6890f6f

08 Sep, 2009 9 commits

jffs2/jfs/xfs: switch over to 'check_acl' rather than 'permission()' · 18f4c644

Linus Torvalds authored Aug 28, 2009

This avoids an indirect call in the VFS for each path component lookup.

Well, at least as long as you own the directory in question, and the ACL
check is unnecessary.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

18f4c644

ext[234]: move over to 'check_acl' permission model · 1d5ccd1c

Linus Torvalds authored Aug 28, 2009

Don't implement per-filesystem 'extX_permission()' functions that have
to be called for every path component operation, and instead just expose
the actual ACL checking so that the VFS layer can now do it for us.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1d5ccd1c

shmfs: use 'check_acl' instead of 'permission' · 6d848a48

Linus Torvalds authored Aug 28, 2009

shmfs wants purely standard POSIX ACL semantics, so we can use the new
generic VFS layer POSIX ACL checking rather than cooking our own
'permission()' function.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6d848a48

Make 'check_acl()' a first-class filesystem op · 5909ccaa

Linus Torvalds authored Aug 28, 2009

This is stage one in flattening out the callchains for the common
permission testing.  Rather than have most filesystem implement their
own inode->i_op->permission function that just calls back down to the
VFS layers 'generic_permission()' with the per-filesystem ACL checking
function, the filesystem can just expose its 'check_acl' function
directly, and let the VFS layer do everything for it.

This is all just preparatory - no filesystem actually enables this yet.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5909ccaa

Simplify exec_permission_lite(), part 3 · cb9179ea

Linus Torvalds authored Aug 28, 2009

Don't call down to the generic inode_permission() function just to
call the inode-specific permission function - just do it directly.

The generic inode_permission() code does things like checking MAY_WRITE
and devcgroup_inode_permission(), neither of which are relevant for the
light pathname walk permission checks (we always do just MAY_EXEC, and
the inode is never a special device).
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cb9179ea

Simplify exec_permission_lite() further · f1ac9f6b

Linus Torvalds authored Aug 28, 2009

This function is only called for path components that are already known
to be directories (they have a '->lookup' method).  So don't bother
doing that whole S_ISDIR() testing, the whole point of the 'lite()'
version is that we know that we are looking at a directory component,
and that we're only checking name lookup permission.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f1ac9f6b

Simplify exec_permission_lite() logic · b7a437b0

Linus Torvalds authored Aug 28, 2009

Instead of returning EAGAIN and having the caller do something
special for that case,  just do the special case directly.
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b7a437b0

Do not call 'ima_path_check()' for each path component · e8e66ed2

Linus Torvalds authored Aug 28, 2009

Not only is that a supremely timing-critical path, but it's hopefully
some day going to be lockless for the common case, and ima can't do
that.

Plus the integrity code doesn't even care about non-regular files, so it
was always a total waste of time and effort.
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

e8e66ed2

drm/i915: fix mask bits setting · 7c8460db

Zhenyu Wang authored Sep 08, 2009

eDP is exclusive connector too, and add missing crtc_mask
setting for TV.

This fixes

	http://bugzilla.kernel.org/show_bug.cgi?id=14139Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reported-and-tested-by: Carlos R. Mafra <crmafra2@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7c8460db

07 Sep, 2009 7 commits

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · 3ff323f8

Linus Torvalds authored Sep 07, 2009

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: add LTE/GTE discard + rv515 two sided stencil register.

3ff323f8

Merge branch 'for-linus' of... · 755ae761

Linus Torvalds authored Sep 07, 2009

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  IMA: update ima_counts_put

755ae761

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · 4886b5b4
Linus Torvalds authored Sep 07, 2009
```
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  gianfar: Fix build.
```
4886b5b4
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6 · cbeb2864
Linus Torvalds authored Sep 07, 2009
```
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6:
  pcmcia: add CNF-CDROM-ID for ide
```
cbeb2864

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel · f69fb9c3

Linus Torvalds authored Sep 07, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
  agp/intel: support for new chip variant of IGDNG mobile
  drm/i915: Unref old_obj on get_fence_reg() error path
  drm/i915: increase default latency constant (v2 w/comment)

f69fb9c3

drm/radeon/kms: add LTE/GTE discard + rv515 two sided stencil register. · a54775c8

Dave Airlie authored Sep 07, 2009

This adds some rv350+ register for LTE/GTE discard,
and enables the rv515 two sided stencil register.
It also disables the DEPTHXY_OFFSET register which
can be used to workaround the CS checker.
Moves rs690 to proper place in rs600 and uses correct
table on rs600.
Signed-off-by: Dave Airlie <airlied@redhat.com>

a54775c8

IMA: update ima_counts_put · acd0c935

Mimi Zohar authored Sep 04, 2009

- As ima_counts_put() may be called after the inode has been freed,
verify that the inode is not NULL, before dereferencing it.

- Maintain the IMA file counters in may_open() properly, decrementing
any counter increments on subsequent errors.
Reported-by: Ciprian Docan <docan@eden.rutgers.edu>
Reported-by: J.R. Okajima <hooanon05@yahoo.co.jp>
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Acked-by: Eric Paris <eparis@redhat.com
Signed-off-by: James Morris <jmorris@namei.org>

acd0c935

06 Sep, 2009 1 commit

gianfar: Fix build. · d9d8e041

David S. Miller authored Sep 06, 2009

Reported by Michael Guntsche <mike@it-loops.com>

--------------------
Commit
38bddf04 gianfar: gfar_remove needs to call unregister_netdev()

breaks the build of the gianfar driver because "dev" is undefined in
this function. To quickly test rc9 I changed this to priv->ndev but I do
not know if this is the correct one.
--------------------
Signed-off-by: David S. Miller <davem@davemloft.net>

d9d8e041

05 Sep, 2009 8 commits

Linux 2.6.31-rc9 · e07cccf4
Linus Torvalds authored Sep 05, 2009

e07cccf4

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 · f815c335

Linus Torvalds authored Sep 05, 2009

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: sbp2: fix freeing of unallocated memory
  firewire: ohci: fix Ricoh R5C832, video reception
  firewire: ohci: fix Agere FW643 and multiple cameras
  firewire: core: fix crash in iso resource management

f815c335

powerpc: Fix i8259 interrupt driver kernel crash on ML510 · 74a01180

Roderick Colenbrander authored Sep 03, 2009

This patch fixes a null pointer exception caused by removal of
'ack()' for level interrupts in the Xilinx interrupt driver.  A recent
change to the xilinx interrupt controller removed the ack hook for
level irqs.
Signed-off-by: Roderick Colenbrander <thunderbird2k@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

74a01180

Merge git://git.infradead.org/~dwmw2/mtd-2.6.31 · 5136a6c0

Linus Torvalds authored Sep 05, 2009

* git://git.infradead.org/~dwmw2/mtd-2.6.31:
  JFFS2: add missing verify buffer allocation/deallocation
  mtd: nftl: fix offset alignments
  mtd: nftl: write support is broken
  mtd: m25p80: fix null pointer dereference bug

5136a6c0

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · e505a8d5

Linus Torvalds authored Sep 05, 2009

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: Allow changing max_sectors_kb above the default 512

e505a8d5

Merge branch 'fix/oxygen' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 · b71b7dc0

Linus Torvalds authored Sep 05, 2009

* 'fix/oxygen' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  sound: oxygen: handle cards with missing EEPROM
  sound: oxygen: fix MCLK rate for 192 kHz playback

b71b7dc0

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · 59430c2f

Linus Torvalds authored Sep 05, 2009

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  tc: Fix unitialized kernel memory leak
  pkt_sched: Revert tasklet_hrtimer changes.
  net: sk_free() should be allowed right after sk_alloc()
  gianfar: gfar_remove needs to call unregister_netdev()
  ipw2200: firmware DMA loading rework

59430c2f

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · e9ee3a54

Linus Torvalds authored Sep 05, 2009

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: skcipher - Fix skcipher_dequeue_givcrypt NULL test

e9ee3a54