Commits · 08a1ddab6df7d3c7b6341774cb1cf4b21b96a214 · nexedi / linux

23 Mar, 2013 15 commits

drbd: consolidate as many updates as possible into one AL transaction · 08a1ddab

Lars Ellenberg authored Mar 19, 2013

Depending on current IO depth, try to consolidate as many updates
as possible into one activity log transaction.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

08a1ddab

lru_cache: introduce lc_get_cumulative() · cbe5e610

Lars Ellenberg authored Mar 22, 2013

New helper to be able to consolidate more updates
into a single transaction.
Without this, we can only grab a single refcount
on an updated element while preparing a transaction.

lc_get_cumulative - like lc_get; also finds to-be-changed elements
  @lc: the lru cache to operate on
  @enr: the label to look up

  Unlike lc_get this also returns the element for @enr, if it is belonging to
  a pending transaction, so the return values are like for lc_get(),
  plus:

  pointer to an element already on the "to_be_changed" list.
	  In this case, the cache was already marked %LC_DIRTY.

  Caller needs to make sure that the pending transaction is completed,
  before proceeding to actually use this element.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>

Fixed up by Jens to export lc_get_cumulative().
Signed-off-by: Jens Axboe <axboe@kernel.dk>

cbe5e610

drbd: queue writes on submitter thread, unless they pass the activity log fastpath · 779b3fe4

Lars Ellenberg authored Mar 19, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

779b3fe4

drbd: split out some helper functions to drbd_al_begin_io · 6c3c4355

Lars Ellenberg authored Mar 19, 2013

To make the code easier to follow,
use an explicit find_active_resync_extent(),
and add a "nonblock" parameter to _al_get().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6c3c4355

drbd: split drbd_al_begin_io into fastpath, prepare, and commit · b5bc8e08

Lars Ellenberg authored Mar 19, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

b5bc8e08

drbd: prepare to queue write requests on a submit worker · 113fef9e

Lars Ellenberg authored Mar 22, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

113fef9e

drbd: split __drbd_make_request in before and after drbd_al_begin_io · 6d9febe2

Lars Ellenberg authored Mar 19, 2013

This is in preparation to be able to defer requests that need to wait
for an activity log transaction to a submitter workqueue.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

6d9febe2

drbd: drbd_al_being_io: short circuit to reduce latency · ebfd5d8f

Lars Ellenberg authored Mar 19, 2013

A request hitting an already "hot" extent should proceed right away,
even if some other requests need to wait for pending transactions.

Without that short-circuit, several simultaneous make_request contexts
race for committing the transaction, possibly penalizing the innocent.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ebfd5d8f

drbd: Clarify when activity log I/O is delegated to the worker thread · 56392d2f

Lars Ellenberg authored Mar 19, 2013

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

56392d2f

drbd: read meta data early, base on-disk offsets on super block · c04ccaa6

Lars Ellenberg authored Mar 19, 2013

We used to calculate all on-disk meta data offsets, and then compare
the stored offsets, basically treating them as magic numbers.

Now with the activity log striping, the activity log size is no longer
fixed. We need to first read the super block, then base the activity
log and bitmap offsets on the stored offsets/al stripe settings.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c04ccaa6

drbd: mechanically rename la_size to la_size_sect · cccac985

Lars Ellenberg authored Mar 19, 2013

Make it obvious that this value is in units of 512 Byte sectors.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

cccac985

drbd: use the cached meta_dev_idx · 68e41a43

Lars Ellenberg authored Mar 19, 2013

Now we have the cached meta_dev_idx member,
we can get rid of a few rcu_read_lock() sections and rcu_dereference().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

68e41a43

drbd: prepare for new striped layout of activity log · 3a4d4eb3

Lars Ellenberg authored Mar 19, 2013

Introduce two new on-disk meta data fields: al_stripes and al_stripe_size_4k
The intended use case is activity log on RAID 0 or similar.
Logically consecutive transactions will advance their on-disk position
by al_stripe_size_4k 4kB (transaction sized) blocks.

Right now, these are still asserted to be the backward compatible
values al_stripes = 1, al_stripe_size_4k = 8 (which amounts to 32kB).

Also introduce a caching member for meta_dev_idx in the in-core
structure: even though it is initially passed in in the rcu-protected
disk_conf structure, it cannot change without a detach/attach cycle.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3a4d4eb3

drbd: cleanup ondisk meta data layout calculations and defines · ae8bf312

Lars Ellenberg authored Mar 19, 2013

Add a comment about our meta data layout variants,
and rename a few defines (e.g. MD_RESERVED_SECT -> MD_128MB_SECT)
to make it clear that they are short hand for fixed constants,
and not arbitrarily to be redefined as one may see fit.

Properly pad struct meta_data_on_disk to 4kB,
and initialize to zero not only the first 512 Byte,
but all of it in drbd_md_sync().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ae8bf312

drbd: cleanup bogus assert message · 9114d795

Lars Ellenberg authored Mar 19, 2013

This fixes ASSERT( mdev->state.disk == D_FAILED ) in drivers/block/drbd/drbd_main.c

When we detach from local disk, we let the local refcount hit zero twice.

First, we transition to D_FAILED, so we won't give out new references
to incoming requests; we still may give out *internal* references, though.
Once the refcount hits zero [1] while in D_FAILED, we queue a transition
to D_DISKLESS to our worker. We need to queue it, because we may be in
atomic context when putting the reference.
Once the transition to D_DISKLESS actually happened [2] from worker context,
we don't give out new internal references either.

Between hitting zero the first time [1] and actually transition to
D_DISKLESS [2], there may be a few very short lived internal get/put,
so we may hit zero more than once while being in D_FAILED, or even see a
race where a an internal get_ldev() happened while D_FAILED, but the
corresponding put_ldev() happens just after the transition to D_DISKLESS.

That's why we have the additional test_and_set_bit(GO_DISKLESS,);
and that's why the assert was placed wrong.
Since there was exactly one code path left to drbd_go_diskless(),
and that checks already for D_FAILED, drop that assert,
and fold in the drbd_queue_work().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

9114d795

17 Mar, 2013 4 commits

Linux 3.9-rc3 · a937536b
Linus Torvalds authored Mar 17, 2013

a937536b

perf,x86: fix link failure for non-Intel configs · 6c4d3bc9

David Rientjes authored Mar 17, 2013

Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") introduces a link failure since
perf_restore_debug_store() is only defined for CONFIG_CPU_SUP_INTEL:

	arch/x86/power/built-in.o: In function `restore_processor_state':
	(.text+0x45c): undefined reference to `perf_restore_debug_store'

Fix it by defining the dummy function appropriately.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

6c4d3bc9

perf,x86: fix wrmsr_on_cpu() warning on suspend/resume · 2a6e06b2

Linus Torvalds authored Mar 17, 2013

Commit 1d9d8639 ("perf,x86: fix kernel crash with PEBS/BTS after
suspend/resume") fixed a crash when doing PEBS performance profiling
after resuming, but in using init_debug_store_on_cpu() to restore the
DS_AREA mtrr it also resulted in a new WARN_ON() triggering.

init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU
cross-calls to do the MSR update. Which is not really valid at the
early resume stage, and the warning is quite reasonable. Now, it all
happens to _work_, for the simple reason that smp_call_function_single()
ends up just doing the call directly on the CPU when the CPU number
matches, but we really should just do the wrmsr() directly instead.

This duplicates the wrmsr() logic, but hopefully we can just remove the
wrmsr_on_cpu() version eventually.
Reported-and-tested-by: Parag Warudkar <parag.lkml@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2a6e06b2

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 08637024

Linus Torvalds authored Mar 17, 2013

Pull btrfs fixes from Chris Mason:
 "Eric's rcu barrier patch fixes a long standing problem with our
  unmount code hanging on to devices in workqueue helpers.  Liu Bo
  nailed down a difficult assertion for in-memory extent mappings."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix warning of free_extent_map
  Btrfs: fix warning when creating snapshots
  Btrfs: return as soon as possible when edquot happens
  Btrfs: return EIO if we have extent tree corruption
  btrfs: use rcu_barrier() to wait for bdev puts at unmount
  Btrfs: remove btrfs_try_spin_lock
  Btrfs: get better concurrency for snapshot-aware defrag work

08637024

16 Mar, 2013 8 commits

Btrfs: fix warning of free_extent_map · 3b277594

Liu Bo authored Mar 15, 2013

Users report that an extent map's list is still linked when it's actually
going to be freed from cache.

The story is that

a) when we're going to drop an extent map and may split this large one into
smaller ems, and if this large one is flagged as EXTENT_FLAG_LOGGING which means
that it's on the list to be logged, then the smaller ems split from it will also
be flagged as EXTENT_FLAG_LOGGING, and this is _not_ expected.

b) we'll keep ems from unlinking the list and freeing when they are flagged with
EXTENT_FLAG_LOGGING, because the log code holds one reference.

The end result is the warning, but the truth is that we set the flag
EXTENT_FLAG_LOGGING only during fsync.

So clear flag EXTENT_FLAG_LOGGING for extent maps split from a large one.
Reported-by: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

3b277594

Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · e2043785

Linus Torvalds authored Mar 15, 2013

Pull kbuild fix from Michal Marek:
 "One fix for for make headers_install/headers_check to not require make
  3.81.  The requirement has been accidentally introduced in 3.7."

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kbuild: fix make headers_check with make 3.80

e2043785

Merge tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux · 23659587

Linus Torvalds authored Mar 15, 2013

Pull OpenRISC bug fixes from Jonas Bonn:

 - The GPIO descriptor work has exposed how broken the non-GPIOLIB bits
   for OpenRISC were.  We now require GPIOLIB as this is the preferred
   way forward.

 - The system.h split introduced a bug in llist.h for arches using
   asm-generic/cmpxchg.h directly, which is currently only OpenRISC.
   The patch here moves two defines from asm-generic/atomic.h to
   asm-generic/cmpxchg.h to make things work as they should.

 - The VIRT_TO_BUS selector was added for OpenRISC, but OpenRISC does
   not have the virt_to_bus methods, so there's a patch to remove it
   again.

* tag 'for-3.9-rc3' of git://openrisc.net/jonas/linux:
  openrisc: remove HAVE_VIRT_TO_BUS
  asm-generic: move cmpxchg*_local defs to cmpxchg.h
  openrisc: require gpiolib

23659587

Merge tag 'char-misc-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 9e1a0aab

Linus Torvalds authored Mar 15, 2013

Pull char/misc fixes from Greg Kroah-Hartman:
 "Here are some tiny fixes for the w1 drivers and the final removal
  patch for getting rid of CONFIG_EXPERIMENTAL (all users of it are now
  gone from your tree, this just drops the Kconfig item itself.)

  All have been in the linux-next tree for a while"

* tag 'char-misc-3.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  final removal of CONFIG_EXPERIMENTAL
  w1: fix oops when w1_search is called from netlink connector
  w1-gpio: fix unused variable warning
  w1-gpio: remove erroneous __exit and __exit_p()
  ARM: w1-gpio: fix erroneous gpio requests

9e1a0aab

Merge tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 5cd8846c

Linus Torvalds authored Mar 15, 2013

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes, as expected for the middle rc:
   - A couple of fixes for potential NULL dereferences and out-of-range
     array accesses revealed by static code parsers
   - A fix for the wrong error handling detected by trinity
   - A regression fix for missing audio on some MacBooks
   - CA0132 DSP loader fixes
   - Fix for EAPD control of IDT codecs on machines w/o speaker
   - Fix a regression in the HD-audio widget list parser code
   - Workaround for the NuForce UDH-100 USB audio"

* tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs
  sound: sequencer: cap array index in seq_chn_common_event()
  ALSA: hda/ca0132 - Remove extra setting of dsp_state.
  ALSA: hda/ca0132 - Check download state of DSP.
  ALSA: hda/ca0132 - Check if dspload_image succeeded.
  ALSA: hda - Disable IDT eapd_switch if there are no internal speakers
  ALSA: hda - Fix snd_hda_get_num_raw_conns() to return a correct value
  ALSA: usb-audio: add a workaround for the NuForce UDH-100
  ALSA: asihpi - fix potential NULL pointer dereference
  ALSA: seq: Fix missing error handling in snd_seq_timer_open()

5cd8846c

Merge branch 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping · c7f17deb

Linus Torvalds authored Mar 15, 2013

Pull DMA-mapping fix from Marek Szyprowski:
 "An important fix for all ARM architectures which use ZONE_DMA.
  Without it dma_alloc_* calls with GFP_ATOMIC flag might have allocated
  buffers outsize DMA zone."

* 'fixes-for-3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: DMA-mapping: add missing GFP_DMA flag for atomic buffer allocation

c7f17deb

Merge tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes · de1893f6

Linus Torvalds authored Mar 15, 2013

Pull MFD fixes from Samuel Ortiz:
 "This is the first batch of MFD fixes for 3.9.

  With this one we have:

   - An ab8500 build failure fix.
   - An ab8500 device tree parsing fix.
   - A fix for twl4030_madc remove routine to work properly (when
     built-in).
   - A fix for properly registering palmas interrupt handler.
   - A fix for omap-usb init routine to actually write into the
     hostconfig register.
   - A couple of warning fixes for ab8500-gpadc and tps65912"

* tag 'mfd-fixes-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
  mfd: twl4030-madc: Remove __exit_p annotation
  mfd: ab8500: Kill "reg" property from binding
  mfd: ab8500-gpadc: Complain if we fail to enable vtvout LDO
  mfd: wm831x: Don't forward declare enum wm831x_auxadc
  mfd: twl4030-audio: Fix argument type for twl4030_audio_disable_resource()
  mfd: tps65912: Declare and use tps65912_irq_exit()
  mfd: palmas: Provide irq flags through DT/platform data
  mfd: Make AB8500_CORE select POWER_SUPPLY to fix build error
  mfd: omap-usb-host: Actually update hostconfig

de1893f6

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 92fbb1c9

Linus Torvalds authored Mar 15, 2013

Pull hwmon fixes from Guenter Roeck:
 "Bug fixes for pmbus, ltc2978, and lineage-pem drivers

  Added specific maintainer for some hwmon drivers"

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus/ltc2978) Fix temperature reporting
  hwmon: (pmbus) Fix krealloc() misuse in pmbus_add_attribute()
  hwmon: (lineage-pem) Add missing terminating entry for pem_[input|fan]_attributes
  MAINTAINERS: Add maintainer for MAX6697, INA209, and INA2XX drivers

92fbb1c9

15 Mar, 2013 8 commits

perf,x86: fix kernel crash with PEBS/BTS after suspend/resume · 1d9d8639

Stephane Eranian authored Mar 15, 2013

This patch fixes a kernel crash when using precise sampling (PEBS)
after a suspend/resume. Turns out the CPU notifier code is not invoked
on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly
by the kernel and keeps it power-on/resume value of 0 causing any PEBS
measurement to crash when running on CPU0.

The workaround is to add a hook in the actual resume code to restore
the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0,
the DS_AREA will be restored twice but this is harmless.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1d9d8639

ALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs · 6d3073e1

Takashi Iwai authored Mar 15, 2013

During the transition to the generic parser, the hook to the codec
specific automute function was forgotten.  This resulted in the silent
output on some MacBooks.
Signed-off-by: Takashi Iwai <tiwai@suse.de>

6d3073e1

sound: sequencer: cap array index in seq_chn_common_event() · 57220bc1

Dan Carpenter authored Mar 15, 2013

"chn" here is a number between 0 and 255, but ->chn_info[] only has
16 elements so there is a potential write beyond the end of the
array.

If the seq_mode isn't SEQ_2 then we let the individual drivers
(either opl3.c or midi_synth.c) handle it.  Those functions all
do a bounds check on "chn" so I haven't changed anything here.
The opl3.c driver has up to 18 channels and not 16.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

57220bc1

mfd: twl4030-madc: Remove __exit_p annotation · 03715410

Arnd Bergmann authored Mar 14, 2013

4740f73f "mfd: remove use of __devexit" removed the __devexit annotation
on the twl4030_madc_remove function, but left an __exit_p() present on the
pointer to this function. Using __exit_p was as wrong with the devexit in
place as it is now, but now we get a gcc warning about an unused function.

In order for the twl4030_madc_remove to work correctly in built-in code, we
have to remove the __exit_p.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>

03715410

ALSA: hda/ca0132 - Remove extra setting of dsp_state. · b714a710

Dylan Reid authored Mar 14, 2013

spec->dsp_state is initialized to DSP_DOWNLOAD_INIT, no need to reset
and check it in ca0132_download_dsp().
Signed-off-by: Dylan Reid <dgreid@chromium.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

b714a710

ALSA: hda/ca0132 - Check download state of DSP. · e8f1bd5d

Dylan Reid authored Mar 14, 2013

Instead of using the dspload_is_loaded() function, check the dsp_state
that is kept in the spec.  The dspload_is_loaded() function returns
true if the DSP transfer was never started.  This false-positive leads
to multiple second delays when ca0132_setup_efaults() times out on
each write.
Signed-off-by: Dylan Reid <dgreid@chromium.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

e8f1bd5d

ALSA: hda/ca0132 - Check if dspload_image succeeded. · d1d28500

Dylan Reid authored Mar 14, 2013

If dspload_image() fails, it was ignored and dspload_wait_loaded() was
still called.  dsp_loaded should never be set to true in this case,
skip it.  The check in dspload_wait_loaded() return true if the DSP is
loaded or if it never started.
Signed-off-by: Dylan Reid <dgreid@chromium.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

d1d28500

mm/fremap.c: fix possible oops on error path · a2362d24

Michel Lespinasse authored Mar 14, 2013

The vm_flags introduced in 6d7825b1 ("mm/fremap.c: fix oops on error
path") is supposed to avoid a compiler warning about unitialized
vm_flags without changing the generated code.

However I am concerned that this is going to be very brittle, and fail
with some compiler versions. The failure could be either of:

- compiler could actually load vma->vm_flags before checking for the
  !vma condition, thus reintroducing the oops

- compiler could optimize out the !vma check, since the pointer just got
  dereferenced shortly before (so the compiler knows it can't be NULL!)

I propose reversing this part of the change and initializing vm_flags to 0
just to avoid the bogus uninitialized use warning.
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a2362d24

14 Mar, 2013 5 commits

Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · f4846e52

Linus Torvalds authored Mar 14, 2013

Pull fix for hlist_entry_safe() regression from Paul McKenney:
 "This contains a single commit that fixes a regression in
  hlist_entry_safe().  This macro references its argument twice, which
  can cause NULL-pointer errors.  This commit applies a gcc statement
  expression, creating a temporary variable to avoid the double
  reference.  This has been posted to LKML at

    https://lkml.org/lkml/2013/3/9/75.

  Kudos to CAI Qian, whose testing uncovered this, to Eric Dumazet, who
  spotted root cause, and to Li Zefan, who tested this commit."

* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  list: Fix double fetch of pointer in hlist_entry_safe()

f4846e52

list: Fix double fetch of pointer in hlist_entry_safe() · f65846a1

Paul E. McKenney authored Mar 09, 2013

The current version of hlist_entry_safe() fetches the pointer twice,
once to test for NULL and the other to compute the offset back to the
enclosing structure.  This is OK for normal lock-based use because in
that case, the pointer cannot change.  However, when the pointer is
protected by RCU (as in "rcu_dereference(p)"), then the pointer can
change at any time.  This use case can result in the following sequence
of events:

1.	CPU 0 invokes hlist_entry_safe(), fetches the RCU-protected
	pointer as sees that it is non-NULL.

2.	CPU 1 invokes hlist_del_rcu(), deleting the entry that CPU 0
	just fetched a pointer to.  Because this is the last entry
	in the list, the pointer fetched by CPU 0 is now NULL.

3.	CPU 0 refetches the pointer, obtains NULL, and then gets a
	NULL-pointer crash.

This commit therefore applies gcc's "({ })" statement expression to
create a temporary variable so that the specified pointer is fetched
only once, avoiding the above sequence of events.  Please note that
it is the caller's responsibility to use rcu_dereference() as needed.
This allows RCU-protected uses to work correctly without imposing
any additional overhead on the non-RCU case.

Many thanks to Eric Dumazet for spotting root cause!
Reported-by: CAI Qian <caiqian@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Li Zefan <lizefan@huawei.com>

f65846a1

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 40e4591d

Linus Torvalds authored Mar 14, 2013

Pull ext2, ext3, reiserfs, quota fixes from Jan Kara:
 "A fix for regression in ext2, and a format string issue in ext3.  The
  rest isn't too serious."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  ext2: Fix BUG_ON in evict() on inode deletion
  reiserfs: Use kstrdup instead of kmalloc/strcpy
  ext3: Fix format string issues
  quota: add missing use of dq_data_lock in __dquot_initialize

40e4591d

Btrfs: fix warning when creating snapshots · 7c2ec3f0

Liu Bo authored Mar 13, 2013

Creating snapshot passes extent_root to commit its transaction,
but it can lead to the warning of checking root for quota in
the __btrfs_end_transaction() when someone else is committing
the current transaction.  Since we've recorded the needed root
in trans_handle, just use it to get rid of the warning.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

7c2ec3f0

Btrfs: return as soon as possible when edquot happens · 720f1e20

Wang Shilong authored Mar 06, 2013

If one of qgroup fails to reserve firstly, we should return immediately,
it is unnecessary to continue check.
Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>

720f1e20