Commits · f23f5be0ed87f1c53420086c936385d0a80c7e4d · Kirill Smelkov / linux

11 Sep, 2012 1 commit

ARM: shmobile: sh73a0: enable PMU(Performance Monitoring Unit) · f23f5be0

Tetsuyuki Kobayashi authored Sep 06, 2012

This patch enables PMU(Performance Monitoring Unit) for sh73a0.
Signed-off-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Signed-off-by: Simon Horman <horms@verge.net.au>

f23f5be0

23 Aug, 2012 9 commits

ARM: perf: move irq registration into pmu implementation · 051f1b13

Sudeep KarkadaNagesha authored Jul 31, 2012

This patch moves the CPU-specific IRQ registration and parsing code into
the CPU PMU backend. This is required because a PMU may have more than
one interrupt, which in turn can be either PPI (per-cpu) or SPI
(requiring strict affinity setting at the interrupt distributor).
Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
[will: cosmetic edits and reworked interrupt dispatching]
Signed-off-by: Will Deacon <will.deacon@arm.com>

051f1b13

ARM: perf: move CPU-specific PMU handling code into separate file · 5505b206

Will Deacon authored Jul 29, 2012

This patch moves the CPU-specific PMU handling code out of perf_event.c
and into perf_event_cpu.c.
Signed-off-by: Will Deacon <will.deacon@arm.com>

5505b206

ARM: perf: prepare for moving CPU PMU code into separate file · 6dbc0029

Will Deacon authored Jul 29, 2012

The CPU PMU code is tightly coupled with generic ARM PMU handling code.
This makes it cumbersome when trying to add support for other ARM PMUs
(e.g. interconnect, L2 cache controller, bus) as the generic parts of
the code are not readily reusable.

This patch cleans up perf_event.c so that reusable code is exposed via
header files to other potential PMU drivers. The CPU code is
consistently named to identify it as such and also to prepare for moving
it into a separate file.
Signed-off-by: Will Deacon <will.deacon@arm.com>

6dbc0029

ARM: perf: probe devicetree in preference to current CPU · 04236f9f

Will Deacon authored Jul 28, 2012

The CPU PMU is probed using the current cpuid information as part of the
early_initcall initialising the architecture perf backend. For
architectures without NMI (such as ARM), this does not need to be
performed early and can be deferred to the driver probe callback. This
also allows us to probe the devicetree in preference to parsing the
current cpuid, which may be invalid on a big.LITTLE multi-cluster
system.

This patch defers the PMU probing and uses the devicetree information
when available.
Signed-off-by: Will Deacon <will.deacon@arm.com>

04236f9f

ARM: perf: remove mysterious compiler barrier · 9f44f9a2

Will Deacon authored Jul 28, 2012

There's a rather strange compiler barrier in the PMU disabling code
which was presumably placed there by aliens. There's no valid reason for
the barrier and one can only suspect that it's up to no good.

This patch removes it before it has a chance to spread.
Signed-off-by: Will Deacon <will.deacon@arm.com>

9f44f9a2

ARM: pmu: remove arm_pmu_type enumeration · df3d17e0

Sudeep KarkadaNagesha authored Jul 19, 2012

The arm_pmu_type enumeration was initially introduced to identify
different PMU types in the system, the usual one being that on the CPU
(ARM_PMU_DEVICE_CPU). With the removal of the PMU reservation code and
the introduction of devicetree bindings for the CPU PMU, the enumeration
is no longer required.

This patch removes the enumeration and updates the various CPU PMU
platform devices so that they no longer pass an .id field referring
to identify the PMU type.

Cc: Haojian Zhuang <haojian.zhuang@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Pawel Moll <pawel.moll@arm.com>
Acked-by: Jon Hunter <jon-hunter@ti.com>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiandong Zheng <jdzheng@broadcom.com>
Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
[will: cosmetic edits and actual removal of the enum type]
Signed-off-by: Will Deacon <will.deacon@arm.com>

df3d17e0

ARM: pmu: remove unused reservation mechanism · f0d1bc47

Will Deacon authored Jul 28, 2012

The PMU reservation mechanism was originally intended to allow OProfile
and perf-events to co-ordinate over access to the CPU PMU. Since then,
OProfile for ARM has moved to using perf as its backend, so the
reservation code is no longer used.

This patch removes the reservation code for the CPU PMU on ARM.
Signed-off-by: Will Deacon <will.deacon@arm.com>

f0d1bc47

ARM: perf: add devicetree bindings for 11MPcore, A5, A7 and A15 PMUs · 50243efd

Will Deacon authored Jul 28, 2012

This patch adds separate devicetree bindings for 11MPcore and
Cortex-{A5,A7,A15} PMUs in preparation for improved devicetree parsing
in the ARM perf-event CPU PMU driver.

Cc: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

50243efd

ARM: PMU: Add runtime PM Support · 7be2958e

Jon Hunter authored May 31, 2012

Add runtime PM support to the ARM PMU driver so that devices such as OMAP
supporting dynamic PM can use the platform->runtime_* hooks to initialise
hardware at runtime. Without having these runtime PM hooks in place any
configuration of the PMU hardware would be lost when low power states are
entered and hence would prevent PMU from working.

This change also replaces the PMU platform functions enable_irq and disable_irq
added by Ming Lei with runtime_resume and runtime_suspend funtions. Ming had
added the enable_irq and disable_irq functions as a method to configure the
cross trigger interface on OMAP4 for routing the PMU interrupts. By adding
runtime PM support, we can move the code called by enable_irq and disable_irq
into the runtime PM callbacks runtime_resume and runtime_suspend.

Cc: Ming Lei <ming.lei@canonical.com>
Cc: Benoit Cousson <b-cousson@ti.com>
Cc: Paul Walmsley <paul@pwsan.com>
Cc: Kevin Hilman <khilman@ti.com>
Signed-off-by: Jon Hunter <jon-hunter@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

7be2958e

22 Aug, 2012 18 commits

Linux 3.6-rc3 · fea7a08a
Linus Torvalds authored Aug 22, 2012

fea7a08a

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 4ff63e47

Linus Torvalds authored Aug 22, 2012

Pull drm fixes from Dave Airlie:
 "Intel: edid fixes, power consumption fix, s/r fix, haswell fix

  Radeon: BIOS loading fixes for UEFI and Thunderbolt machines, better
  MSAA validation, lockup timeout fixes, modesetting fixes

  One udl dpms fix, one vmwgfx fix, a couple of trivial core changes.

  There is an export added to ACPI as part of the radeon bios fixes.

  I've also included the fbcon flashing cursor vs deinit race fix, that
  seems the simplest place to start"

Trivial conflict in drivers/video/console/fbcon.c due to me having
already applied the fbcon flashing cursor vs deinit race fix, and Dave
had added a comment in there too.

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (22 commits)
  fbcon: fix race condition between console lock and cursor timer (v1.1)
  drm: Add missing static storage class specifiers in drm_proc.c file
  drm/udl: dpms off the crtc when disabled.
  drm: Remove two unused fields from struct drm_display_mode
  drm: stop vmgfx driver explosion
  drm/radeon/ss: use num_crtc rather than hardcoded 6
  Revert "drm/radeon: fix bo creation retry path"
  drm/i915: use hsw rps tuning values everywhere on gen6+
  drm/radeon: split ATRM support out from the ATPX handler (v3)
  drm/radeon: convert radeon vfct code to use acpi_get_table_with_size
  ACPI: export symbol acpi_get_table_with_size
  drm/radeon: implement ACPI VFCT vbios fetch (v3)
  drm/radeon/kms: extend the Fujitsu D3003-S2 board connector quirk to cover later silicon stepping
  drm/radeon: fix checking of MSAA renderbuffers on r600-r700
  drm/radeon: allow CMASK and FMASK in the CS checker on r600-r700
  drm/radeon: init lockup timeout on ring init
  drm/radeon: avoid turning off spread spectrum for used pll
  drm/i915: fall back to bit-banging if GMBUS fails in CRT EDID reads
  drm/i915: extract connector update from intel_ddc_get_modes() for reuse
  drm/i915: fix hsw uncached pte
  ...

4ff63e47

Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 09236994

Linus Torvalds authored Aug 22, 2012

Pull SCSI target fixes from Nicholas Bellinger:
 "The executive summary includes:

   - Post-merge review comments for tcm_vhost (MST + nab)
   - Avoid debugging overhead when not debugging for tcm-fc(FCoE) (MDR)
   - Fix NULL pointer dereference bug on alloc_page failulre (Yi Zou)
   - Fix REPORT_LUNs regression bug with pSCSI export (AlexE + nab)
   - Fix regression bug with handling of zero-length data CDBs (nab)
   - Fix vhost_scsi_target structure alignment (MST)

  Thanks again to everyone who contributed a bugfix patch, gave review
  feedback on tcm_vhost code, and/or reported a bug during their own
  testing over the last weeks.

  There is one other outstanding bug reported by Roland recently related
  to SCSI transfer length overflow handling, for which the current
  proposed bugfix has been left in queue pending further testing with
  other non iscsi-target based fabric drivers.

  As the patch is verified with loopback (local SGL memory from SCSI
  LLD) + tcm_qla2xxx (TCM allocated SGL memory mapped to PCI HW) fabric
  ports, it will be included into the next 3.6-rc-fixes PULL request."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Remove unused se_cmd.cmd_spdtl
  tcm_fc: rcu_deref outside rcu lock/unlock section
  tcm_vhost: Fix vhost_scsi_target structure alignment
  target: Fix regression bug with handling of zero-length data CDBs
  target/pscsi: Fix bug with REPORT_LUNs handling for SCSI passthrough
  tcm_vhost: Change vhost_scsi_target->vhost_wwpn to char *
  target: fix NULL pointer dereference bug alloc_page() fails to get memory
  tcm_fc: Avoid debug overhead when not debugging
  tcm_vhost: Post-merge review changes requested by MST
  tcm_vhost: Fix incorrect IS_ERR() usage in vhost_scsi_map_iov_to_sgl

09236994

Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux · 2e2d8c93

Linus Torvalds authored Aug 22, 2012

Pull i2c-embedded fixes from Wolfram Sang:
 "Some bugfixes for the "embedded" part of the I2C subsystem.  The fixes
  affect mostly drivers which have been largely reworked lately and
  where regressions appeared."

* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
  i2c: tegra: protect suspend/resume callbacks with CONFIG_PM_SLEEP
  i2c: diolan-u2c: Fix master_xfer return code
  I2C: OMAP: xfer: fix runtime PM get/put balance on error
  i2c: nomadik: Add default configuration into the Nomadik I2C driver

2e2d8c93

Merge tag 'for-3.6-rc3' of git://gitorious.org/linux-pwm/linux-pwm · fec3c03f

Linus Torvalds authored Aug 22, 2012

Pull pwm fixes from Thierry Reding:
 "These patches fix the Samsung PWM driver and perform some minor
  cleanups like fixing checkpatch and sparse warnings.

  Two redundant error messages are removed and the Kconfig help text for
  the PWM subsystem is made more descriptive."

* tag 'for-3.6-rc3' of git://gitorious.org/linux-pwm/linux-pwm:
  pwm: Improve Kconfig help text
  pwm: core: Fix coding style issues
  pwm: vt8500: Fix coding style issue
  pwm: Remove a redundant error message when devm_request_and_ioremap fails
  pwm: samsung: add missing device pointer to struct pwm_chip
  pwm: Add missing static storage class specifiers in core.c file

fec3c03f

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · f753c4ec

Linus Torvalds authored Aug 22, 2012

Pull ceph fixes from Sage Weil:
 "Jim's fix closes a narrow race introduced with the msgr changes.  One
  fix resolves problems with debugfs initialization that Yan found when
  multiple client instances are created (e.g., two clusters mounted, or
  rbd + cephfs), another one fixes problems with mounting a nonexistent
  server subdirectory, and the last one fixes a divide by zero error
  from unsanitized ioctl input that Dan Carpenter found."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: avoid divide by zero in __validate_layout()
  libceph: avoid truncation due to racing banners
  ceph: tolerate (and warn on) extraneous dentry from mds
  libceph: delay debugfs initialization until we learn global_id

f753c4ec

Merge tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · ad746be9

Linus Torvalds authored Aug 22, 2012

Pull NFS client bugfixes from Trond Myklebust:
 - NFSv3 mounts need to fail if the FSINFO rpc call fails
 - Ensure that the NFS commit cache gets torn down when we unload the
   NFS module.
 - Fix memory scribble issues when interrupting a LAYOUTGET rpc call
 - Fix NFSv4 legacy idmapper regressions
 - Fix issues with the NFSv4 getacl command
 - Fix a regression when using the legacy "mount -t nfs4"

* tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv3: Ensure that do_proc_get_root() reports errors correctly
  NFSv4: Ensure that nfs4_alloc_client cleans up on error.
  NFS: return -ENOKEY when the upcall fails to map the name
  NFS: Clear key construction data if the idmap upcall fails
  NFSv4: Don't use private xdr_stream fields in decode_getacl
  NFSv4: Fix the acl cache size calculation
  NFSv4: Fix pointer arithmetic in decode_getacl
  NFS: Alias the nfs module to nfs4
  NFS: Fix a regression when loading the NFS v4 module
  NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done
  pnfs-obj: Better IO pattern in case of unaligned offset
  NFS41: add pg_layout_private to nfs_pageio_descriptor
  pnfs: nfs4_proc_layoutget returns void
  pnfs: defer release of pages in layoutget
  nfs: tear down caches in nfs_init_writepagecache when allocation fails

ad746be9

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 467e9e51

Linus Torvalds authored Aug 22, 2012

Pull assorted fixes - mostly vfs - from Al Viro:
 "Assorted fixes, with an unexpected detour into vfio refcounting logics
  (fell out when digging in an analog of eventpoll race in there)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  task_work: add a scheduling point in task_work_run()
  fs: fix fs/namei.c kernel-doc warnings
  eventpoll: use-after-possible-free in epoll_create1()
  vfio: grab vfio_device reference *before* exposing the sucker via fd_install()
  vfio: get rid of vfio_device_put()/vfio_group_get_device* races
  vfio: get rid of open-coding kref_put_mutex
  introduce kref_put_mutex()
  vfio: don't dereference after kfree...
  mqueue: lift mnt_want_write() outside ->i_mutex, clean up a bit

467e9e51

task_work: add a scheduling point in task_work_run() · 88ec2789

Eric Dumazet authored Aug 21, 2012

It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced
the problem addressed in commit 944be0b2 (close_files(): add scheduling
point)

If a server process with a lot of files (say 2 million tcp sockets)
is killed, we can spend a lot of time in task_work_run() and trigger
a soft lockup.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

88ec2789

fs: fix fs/namei.c kernel-doc warnings · 55852635

Randy Dunlap authored Aug 18, 2012

Fix kernel-doc warnings in fs/namei.c:

Warning(fs/namei.c:360): No description found for parameter 'inode'
Warning(fs/namei.c:672): No description found for parameter 'nd'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc:	Alexander Viro <viro@zeniv.linux.org.uk>
Cc:	linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

55852635

eventpoll: use-after-possible-free in epoll_create1() · 98022748

Al Viro authored Aug 17, 2012

As soon as we'd installed the file into descriptor table, it can
get closed by another thread.  Freeing ep in process...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

98022748

vfio: grab vfio_device reference *before* exposing the sucker via fd_install() · 31605deb

Al Viro authored Aug 17, 2012

It's not critical (anymore) since another thread closing the file will block
on ->device_lock before it gets to dropping the final reference, but it's
definitely cleaner that way...
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

31605deb

vfio: get rid of vfio_device_put()/vfio_group_get_device* races · 90b1253e

Al Viro authored Aug 17, 2012

we really need to make sure that dropping the last reference happens
under the group->device_lock; otherwise a loop (under device_lock)
might find vfio_device instance that is being freed right now, has
already dropped the last reference and waits on device_lock to exclude
the sucker from the list.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

90b1253e

vfio: get rid of open-coding kref_put_mutex · 6d2cd3ce

Al Viro authored Aug 17, 2012

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

6d2cd3ce

introduce kref_put_mutex() · 8ad5db8a

Al Viro authored Aug 17, 2012

equivalent of
	mutex_lock(mutex);
	if (!kref_put(kref, release))
		mutex_unlock(mutex);
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8ad5db8a

vfio: don't dereference after kfree... · 934ad4c2

Al Viro authored Aug 17, 2012

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

934ad4c2

fbcon: fix race condition between console lock and cursor timer (v1.1) · d8636a27

Dave Airlie authored Aug 21, 2012

So we've had a fair few reports of fbcon handover breakage between
efi/vesafb and i915 surface recently, so I dedicated a couple of
days to finding the problem.

Essentially the last thing we saw was the conflicting framebuffer
message and that was all.

So after much tracing with direct netconsole writes (printks
under console_lock not so useful), I think I found the race.

Thread A (driver load)    Thread B (timer thread)
  unbind_con_driver ->              |
  bind_con_driver ->                |
  vc->vc_sw->con_deinit ->          |
  fbcon_deinit ->                   |
  console_lock()                    |
      |                             |
      |                       fbcon_flashcursor timer fires
      |                       console_lock() <- blocked for A
      |
      |
fbcon_del_cursor_timer ->
  del_timer_sync
  (BOOM)

Of course because all of this is under the console lock,
we never see anything, also since we also just unbound the active
console guess what we never see anything.

Hopefully this fixes the problem for anyone seeing vesafb->kms
driver handoff.

v1.1: add comment suggestion from Alan.

Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

d8636a27

Merge branch 'akpm' (Andrew's patch-bomb) · 23dcfa61

Linus Torvalds authored Aug 21, 2012

Merge fixes from Andrew Morton.

Random drivers and some VM fixes.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (17 commits)
  mm: compaction: Abort async compaction if locks are contended or taking too long
  mm: have order > 0 compaction start near a pageblock with free pages
  rapidio/tsi721: fix unused variable compiler warning
  rapidio/tsi721: fix inbound doorbell interrupt handling
  drivers/rtc/rtc-rs5c348.c: fix hour decoding in 12-hour mode
  mm: correct page->pfmemalloc to fix deactivate_slab regression
  drivers/rtc/rtc-pcf2123.c: initialize dynamic sysfs attributes
  mm/compaction.c: fix deferring compaction mistake
  drivers/misc/sgi-xp/xpc_uv.c: SGI XPC fails to load when cpu 0 is out of IRQ resources
  string: do not export memweight() to userspace
  hugetlb: update hugetlbpage.txt
  checkpatch: add control statement test to SINGLE_STATEMENT_DO_WHILE_MACRO
  mm: hugetlbfs: correctly populate shared pmd
  cciss: fix incorrect scsi status reporting
  Documentation: update mount option in filesystem/vfat.txt
  mm: change nr_ptes BUG_ON to WARN_ON
  cs5535-clockevt: typo, it's MFGPT, not MFPGT

23dcfa61

21 Aug, 2012 12 commits

Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · a484147a

Linus Torvalds authored Aug 21, 2012

Pull media fixes from Mauro Carvalho Chehab:
 "For bug fixes, at soc_camera, si470x, uvcvideo, iguanaworks IR driver,
  radio_shark Kbuild fixes, and at the V4L2 core (radio fixes)."

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] media: soc_camera: don't clear pix->sizeimage in JPEG mode
  [media] media: mx2_camera: Fix clock handling for i.MX27
  [media] video: mx2_camera: Use clk_prepare_enable/clk_disable_unprepare
  [media] video: mx1_camera: Use clk_prepare_enable/clk_disable_unprepare
  [media] media: mx3_camera: buf_init() add buffer state check
  [media] radio-shark2: Only compile led support when CONFIG_LED_CLASS is set
  [media] radio-shark: Only compile led support when CONFIG_LED_CLASS is set
  [media] radio-shark*: Call cancel_work_sync from disconnect rather then release
  [media] radio-shark*: Remove work-around for dangling pointer in usb intfdata
  [media] Add USB dependency for IguanaWorks USB IR Transceiver
  [media] Add missing logging for rangelow/high of hwseek
  [media] VIDIOC_ENUM_FREQ_BANDS fix
  [media] mem2mem_testdev: fix querycap regression
  [media] si470x: v4l2-compliance fixes
  [media] DocBook: Remove a spurious character
  [media] uvcvideo: Reset the bytesused field when recycling an erroneous buffer

a484147a

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 8f8ba75e

Linus Torvalds authored Aug 21, 2012

Pull networking update from David Miller:
 "A couple weeks of bug fixing in there.  The largest chunk is all the
  broken crap Amerigo Wang found in the netpoll layer."

 1) netpoll and it's users has several serious bugs:
    a) uses GFP_KERNEL with locks held
    b) interfaces requiring interrupts disabled are called with them
       enabled
    c) and vice versa
    d) VLAN tag demuxing, as per all other RX packet input paths, is not
       applied

    All from Amerigo Wang.

 2) Hopefully cure the ipv4 mapped ipv6 address TCP early demux bugs for
    good, from Neal Cardwell.

 3) Unlike AF_UNIX, AF_PACKET sockets don't set a default credentials
    when the user doesn't specify one explicitly during sendmsg().
    Instead we attach an empty (zero) SCM credential block which is
    definitely not what we want.  Fix from Eric Dumazet.

 4) IPv6 illegally invokes netdevice notifiers with RCU lock held, fix
    from Ben Hutchings.

 5) inet_csk_route_child_sock() checks wrong inet options pointer, fix
    from Christoph Paasch.

 6) When AF_PACKET is used for transmit, packet loopback doesn't behave
    properly when a socket fanout is enabled, from Eric Leblond.

 7) On bluetooth l2cap channel create failure, we leak the socket, from
    Jaganath Kanakkassery.

 8) Fix all the netprio file handling bugs found by Al Viro, from John
    Fastabend.

 9) Several error return and NULL deref bug fixes in networking drivers
    from Julia Lawall.

10) A large smattering of struct padding et al.  kernel memory leaks to
    userspace found of Mathias Krause.

11) Conntrack expections in netfilter can access an uninitialized timer,
    fix from Pablo Neira Ayuso.

12) Several netfilter SIP tracker bug fixes from Patrick McHardy.

13) IPSEC ipv6 routes are not initialized correctly all the time,
    resulting in an OOPS in inet_putpeer().  Also from Patrick McHardy.

14) Bridging does rcu_dereference() outside of RCU protected area, from
    Stephen Hemminger.

15) Fix routing cache removal performance regression when looking up
    output routes that have a local destination.  From Zheng Yan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
  af_netlink: force credentials passing [CVE-2012-3520]
  ipv4: fix ip header ident selection in __ip_make_skb()
  ipv4: Use newinet->inet_opt in inet_csk_route_child_sock()
  tcp: fix possible socket refcount problem
  net: tcp: move sk_rx_dst_set call after tcp_create_openreq_child()
  net/core/dev.c: fix kernel-doc warning
  netconsole: remove a redundant netconsole_target_put()
  net: ipv6: fix oops in inet_putpeer()
  net/stmmac: fix issue of clk_get for Loongson1B.
  caif: Do not dereference NULL in chnl_recv_cb()
  af_packet: don't emit packet on orig fanout group
  drivers/net/irda: fix error return code
  drivers/net/wan/dscc4.c: fix error return code
  drivers/net/wimax/i2400m/fw.c: fix error return code
  smsc75xx: add missing entry to MAINTAINERS
  net: qmi_wwan: new devices: UML290 and K5006-Z
  net: sh_eth: Add eth support for R8A7779 device
  netdev/phy: skip disabled mdio-mux nodes
  dt: introduce for_each_available_child_of_node, of_get_next_available_child
  net: netprio: fix cgrp create and write priomap race
  ...

8f8ba75e

mm: compaction: Abort async compaction if locks are contended or taking too long · c67fe375

Mel Gorman authored Aug 21, 2012

Jim Schutt reported a problem that pointed at compaction contending
heavily on locks.  The workload is straight-forward and in his own words;

	The systems in question have 24 SAS drives spread across 3 HBAs,
	running 24 Ceph OSD instances, one per drive.  FWIW these servers
	are dual-socket Intel 5675 Xeons w/48 GB memory.  I've got ~160
	Ceph Linux clients doing dd simultaneously to a Ceph file system
	backed by 12 of these servers.

Early in the test everything looks fine

  procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
   r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
  31 15          0     287216        576   38606628    0    0     2  1158    2   14   1  3  95  0  0
  27 15          0     225288        576   38583384    0    0    18 2222016 203357 134876  11 56  17 15  0
  28 17          0     219256        576   38544736    0    0    11 2305932 203141 146296  11 49  23 17  0
   6 18          0     215596        576   38552872    0    0     7 2363207 215264 166502  12 45  22 20  0
  22 18          0     226984        576   38596404    0    0     3 2445741 223114 179527  12 43  23 22  0

and then it goes to pot

  procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
   r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
  163  8          0     464308        576   36791368    0    0    11 22210  866  536   3 13  79  4  0
  207 14          0     917752        576   36181928    0    0   712 1345376 134598 47367   7 90   1  2  0
  123 12          0     685516        576   36296148    0    0   429 1386615 158494 60077   8 84   5  3  0
  123 12          0     598572        576   36333728    0    0  1107 1233281 147542 62351   7 84   5  4  0
  622  7          0     660768        576   36118264    0    0   557 1345548 151394 59353   7 85   4  3  0
  223 11          0     283960        576   36463868    0    0    46 1107160 121846 33006   6 93   1  1  0

Note that system CPU usage is very high blocks being written out has
dropped by 42%. He analysed this with perf and found

  perf record -g -a sleep 10
  perf report --sort symbol --call-graph fractal,5
    34.63%  [k] _raw_spin_lock_irqsave
            |
            |--97.30%-- isolate_freepages
            |          compaction_alloc
            |          unmap_and_move
            |          migrate_pages
            |          compact_zone
            |          compact_zone_order
            |          try_to_compact_pages
            |          __alloc_pages_direct_compact
            |          __alloc_pages_slowpath
            |          __alloc_pages_nodemask
            |          alloc_pages_vma
            |          do_huge_pmd_anonymous_page
            |          handle_mm_fault
            |          do_page_fault
            |          page_fault
            |          |
            |          |--87.39%-- skb_copy_datagram_iovec
            |          |          tcp_recvmsg
            |          |          inet_recvmsg
            |          |          sock_recvmsg
            |          |          sys_recvfrom
            |          |          system_call
            |          |          __recv
            |          |          |
            |          |           --100.00%-- (nil)
            |          |
            |           --12.61%-- memcpy
             --2.70%-- [...]

There was other data but primarily it is all showing that compaction is
contended heavily on the zone->lock and zone->lru_lock.

commit [b2eef8c0: mm: compaction: minimise the time IRQs are disabled
while isolating pages for migration] noted that it was possible for
migration to hold the lru_lock for an excessive amount of time. Very
broadly speaking this patch expands the concept.

This patch introduces compact_checklock_irqsave() to check if a lock
is contended or the process needs to be scheduled. If either condition
is true then async compaction is aborted and the caller is informed.
The page allocator will fail a THP allocation if compaction failed due
to contention. This patch also introduces compact_trylock_irqsave()
which will acquire the lock only if it is not contended and the process
does not need to schedule.
Reported-by: Jim Schutt <jaschut@sandia.gov>
Tested-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c67fe375

mm: have order > 0 compaction start near a pageblock with free pages · de74f1cc

Mel Gorman authored Aug 21, 2012

Commit 7db8889a ("mm: have order > 0 compaction start off where it
left") introduced a caching mechanism to reduce the amount work the free
page scanner does in compaction.  However, it has a problem.  Consider
two process simultaneously scanning free pages

					    			C
	Process A		M     S     			F
			|---------------------------------------|
	Process B		M 	FS

	C is zone->compact_cached_free_pfn
	S is cc->start_pfree_pfn
	M is cc->migrate_pfn
	F is cc->free_pfn

In this diagram, Process A has just reached its migrate scanner, wrapped
around and updated compact_cached_free_pfn accordingly.

Simultaneously, Process B finishes isolating in a block and updates
compact_cached_free_pfn again to the location of its free scanner.

Process A moves to "end_of_zone - one_pageblock" and runs this check

                if (cc->order > 0 && (!cc->wrapped ||
                                      zone->compact_cached_free_pfn >
                                      cc->start_free_pfn))
                        pfn = min(pfn, zone->compact_cached_free_pfn);

compact_cached_free_pfn is above where it started so the free scanner
skips almost the entire space it should have scanned.  When there are
multiple processes compacting it can end in a situation where the entire
zone is not being scanned at all.  Further, it is possible for two
processes to ping-pong update to compact_cached_free_pfn which is just
random.

Overall, the end result wrecks allocation success rates.

There is not an obvious way around this problem without introducing new
locking and state so this patch takes a different approach.

First, it gets rid of the skip logic because it's not clear that it
matters if two free scanners happen to be in the same block but with
racing updates it's too easy for it to skip over blocks it should not.

Second, it updates compact_cached_free_pfn in a more limited set of
circumstances.

If a scanner has wrapped, it updates compact_cached_free_pfn to the end
	of the zone. When a wrapped scanner isolates a page, it updates
	compact_cached_free_pfn to point to the highest pageblock it
	can isolate pages from.

If a scanner has not wrapped when it has finished isolated pages it
	checks if compact_cached_free_pfn is pointing to the end of the
	zone. If so, the value is updated to point to the highest
	pageblock that pages were isolated from. This value will not
	be updated again until a free page scanner wraps and resets
	compact_cached_free_pfn.

This is not optimal and it can still race but the compact_cached_free_pfn
will be pointing to or very near a pageblock with free pages.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

de74f1cc

rapidio/tsi721: fix unused variable compiler warning · 9a9a9a7a

Alexandre Bounine authored Aug 21, 2012

Fix unused variable compiler warning when built with CONFIG_RAPIDIO_DEBUG
option off.

This patch is applicable to kernel versions starting from v3.2
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9a9a9a7a

rapidio/tsi721: fix inbound doorbell interrupt handling · 3670e7e1

Alexandre Bounine authored Aug 21, 2012

Make sure that there is no doorbell messages left behind due to disabled
interrupts during inbound doorbell processing.

The most common case for this bug is loss of rionet JOIN messages in
systems with three or more rionet participants and MSI or MSI-X enabled.
As result, requests for packet transfers may finish with "destination
unreachable" error message.

This patch is applicable to kernel versions starting from v3.2.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3670e7e1

drivers/rtc/rtc-rs5c348.c: fix hour decoding in 12-hour mode · 7dbfb315

Atsushi Nemoto authored Aug 21, 2012

Correct the offset by subtracting 20 from tm_hour before taking the
modulo 12.

[ "Why 20?" I hear you ask. Or at least I did.

  Here's the reason why: RS5C348_BIT_PM is 32, and is - stupidly -
  included in the RS5C348_HOURS_MASK define.  So it's really subtracting
  out that bit to get "hour+12".  But then because it does things modulo
  12, it needs to add the 12 in again afterwards anyway.

  This code is confused.  It would be much clearer if RS5C348_HOURS_MASK
  just didn't include the RS5C348_BIT_PM bit at all, then it wouldn't
  need to do the silly subtract either.

  Whatever. It's all just math, the end result is the same.   - Linus ]
Reported-by: James Nute <newten82@gmail.com>
Tested-by: James Nute <newten82@gmail.com>
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7dbfb315

mm: correct page->pfmemalloc to fix deactivate_slab regression · b121186a

Alex Shi authored Aug 21, 2012

Commit cfd19c5a ("mm: only set page->pfmemalloc when
ALLOC_NO_WATERMARKS was used") tried to narrow down page->pfmemalloc
setting, but it missed some places the pfmemalloc should be set.

So, in __slab_alloc, the unalignment pfmemalloc and ALLOC_NO_WATERMARKS
cause incorrect deactivate_slab() on our core2 server:

    64.73%           fio  [kernel.kallsyms]     [k] _raw_spin_lock
                     |
                     --- _raw_spin_lock
                        |
                        |---0.34%-- deactivate_slab
                        |          __slab_alloc
                        |          kmem_cache_alloc
                        |          |

That causes our fio sync write performance to have a 40% regression.

Move the checking in get_page_from_freelist() which resolves this issue.
Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Sage Weil <sage@inktank.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b121186a

drivers/rtc/rtc-pcf2123.c: initialize dynamic sysfs attributes · 5ed12f12

Ilya Shchepetkov authored Aug 21, 2012

Dynamically allocated sysfs attributes must be initialized using
sysfs_attr_init(), otherwise lockdep complains: BUG: key <address> not in
.data!

Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Ilya Shchepetkov <shchepetkov@ispras.ru>
Cc: Chris Verges <chrisv@cyberswitching.com>
Cc: Christian Pellegrin <chripell@fsfe.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5ed12f12

mm/compaction.c: fix deferring compaction mistake · c81758fb

Minchan Kim authored Aug 21, 2012

Commit aff62249 ("vmscan: only defer compaction for failed order and
higher") fixed bad deferring policy but made mistake about checking
compact_order_failed in __compact_pgdat().  So it can't update
compact_order_failed with the new order.  This ends up preventing
correct operation of policy deferral.  This patch fixes it.
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c81758fb

drivers/misc/sgi-xp/xpc_uv.c: SGI XPC fails to load when cpu 0 is out of IRQ resources · 7838f994

Robin Holt authored Aug 21, 2012

On many of our larger systems, CPU 0 has had all of its IRQ resources
consumed before XPC loads.  Worst cases on machines with multiple 10
GigE cards and multiple IB cards have depleted the entire first socket
of IRQs.

This patch makes selecting the node upon which IRQs are allocated (as
well as all the other GRU Message Queue structures) specifiable as a
module load param and has a default behavior of searching all nodes/cpus
for an available resources.

[akpm@linux-foundation.org: fix build: include cpu.h and module.h]
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7838f994

string: do not export memweight() to userspace · c3a5ce04

WANG Cong authored Aug 21, 2012

Fix the following warning:

  usr/include/linux/string.h:8: userspace cannot reference function or variable defined in the kernel
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c3a5ce04