1. 26 Sep, 2014 40 commits
    • ext2: Fix fs corruption in ext2_get_xip_mem() · 5163683b
      Jan Kara authored
      Commit 8e3dffc6 "Ext2: mark inode dirty after the function
      dquot_free_block_nodirty is called" unveiled a bug in __ext2_get_block()
      called from ext2_get_xip_mem(). That function called ext2_get_block()
      mistakenly asking it to map 0 blocks while 1 was intended. Before the
      above mentioned commit things worked out fine by luck but after that commit
      we started returning that we allocated 0 blocks while we in fact
      allocated 1 block and thus allocation was looping until all blocks in
      the filesystem were exhausted.
      
      Fix the problem by properly asking for one block and also add assertion
      in ext2_get_blocks() to catch similar problems.
      
      Reported-and-tested-by: Andiry Xu <andiry.xu@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      
      (cherry picked from commit 7ba3ec57)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • 8250_pci: fix warnings in backport of Broadcom TruManage support · 580d2659
      Paul Gortmaker authored
      commit 7400ce7e (v3.4.92-76-g7400ce7e)
      was a backport of commit ebebd49a upstream
      ("8250/16?50: Add support for Broadcom TruManage redirected serial port")
      
      However, in the context of 3.4.x kernels, the pci setup code was
      expecting a struct uart_port and not a struct uart_8250_port, leading to
      the following concerning warnings:
      
      drivers/tty/serial/8250/8250_pci.c: In function ‘pci_brcm_trumanage_setup’:
      drivers/tty/serial/8250/8250_pci.c:1086:2: warning: passing argument 3 of ‘pci_default_setup’ from incompatible pointer type [enabled by default]
        int ret = pci_default_setup(priv, board, port, idx);
        ^
      drivers/tty/serial/8250/8250_pci.c:1036:1: note: expected ‘struct uart_port *’ but argument is of type ‘struct uart_8250_port *’
       pci_default_setup(struct serial_private *priv,
       ^
      drivers/tty/serial/8250/8250_pci.c: At top level:
      drivers/tty/serial/8250/8250_pci.c:1746:3: warning: initialization from incompatible pointer type [enabled by default]
         .setup  = pci_brcm_trumanage_setup,
         ^
      drivers/tty/serial/8250/8250_pci.c:1746:3: warning: (near initialization for ‘pci_serial_quirks[56].setup’) [enabled by default]
      
      I'd also expect the initialization to not function correctly, and
      perhaps dereference random garbage due to this.  Since the uart_port
      is a field within the uart_8250_port, the adaptation to fix these
      warnings is a straightforward removal of a layer of indirection.
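
      The indirection removed here can be sketched in a few lines of C. This is
      a minimal illustration only: the two structs are cut-down stand-ins for the
      kernel's, and default_setup(), trumanage_setup() and setup_from_8250() are
      hypothetical names, not the functions in 8250_pci.c.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (fields are illustrative). */
struct uart_port {
	unsigned long iobase;
	unsigned int flags;
};

struct uart_8250_port {
	struct uart_port port;		/* embedded struct, not a pointer */
	unsigned int capabilities;
};

/* Hypothetical 3.4-style helper: it expects the inner struct uart_port. */
int default_setup(struct uart_port *port, unsigned long base)
{
	assert(port != NULL);
	port->iobase = base;
	return 0;
}

/* Hypothetical quirk with a matching signature, so no pointer cast is needed. */
int trumanage_setup(struct uart_port *port, unsigned long base)
{
	int ret = default_setup(port, base);

	port->flags |= 1u;		/* illustrative quirk bit */
	return ret;
}

/* A caller holding the outer struct simply passes the embedded member. */
int setup_from_8250(struct uart_8250_port *up, unsigned long base)
{
	return trumanage_setup(&up->port, base);
}
```

      With matching pointer types the compiler has nothing to warn about, and
      the quirk only ever touches fields that exist in the object it was given.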
      
      Cc: Stephen Hurd <shurd@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: Zefan Li <lizefan@huawei.com>
      (cherry picked from commit 82a938ab)
      
      (cherry picked from commit HEAD)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • openrisc: add missing header inclusion · 27a615f0
      Stefan Kristiansson authored
      Prevents build issue with updated toolchain
      
      Reported-by: Jack Thomasson <jkt@moonlitsw.com>
      Tested-by: Christian Svensson <blue@cmd.nu>
      Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Signed-off-by: Jonas Bonn <jonas@southpole.se>
      
      (cherry picked from commit 160d8378)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • USB: serial: fix potential heap buffer overflow · e547051b
      Johan Hovold authored
      Make sure to verify the number of ports requested by subdriver to avoid
      writing beyond the end of fixed-size array in interface data.
      
      The current usb-serial implementation is limited to eight ports per
      interface but failed to verify that the number of ports requested by a
      subdriver (which could have been determined from device descriptors) did
      not exceed this limit.
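
      A minimal sketch of this kind of check, assuming the request is rejected
      outright when it exceeds the limit. The struct and serial_iface_init() are
      hypothetical; only the eight-port limit comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUM_PORTS 8		/* eight ports per interface, as stated above */

/* Hypothetical per-interface data with a fixed-size port array. */
struct serial_iface {
	void *port[MAX_NUM_PORTS];
	int num_ports;
};

/*
 * Validate the port count requested by a subdriver before touching the
 * array. Returns 0 on success, -1 if the count does not fit.
 */
int serial_iface_init(struct serial_iface *s, int requested)
{
	assert(s != NULL);

	if (requested < 1 || requested > MAX_NUM_PORTS)
		return -1;

	for (int i = 0; i < requested; i++)
		s->port[i] = NULL;
	s->num_ports = requested;
	return 0;
}
```

      The check has to come before the loop: a count taken from device
      descriptors is untrusted input, so it is validated before it is used as
      an array bound.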
      
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      (cherry picked from commit 5654699f)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • ARM: 8129/1: errata: work around Cortex-A15 erratum 830321 using dummy strex · a8252b76
      Mark Rutland authored
      On revisions of Cortex-A15 prior to r3p3, a CLREX instruction at PL1 may
      falsely trigger a watchpoint exception, leading to potential data aborts
      during exception return and/or livelock.
      
      This patch resolves the issue in the following ways:
      
        - Replacing our uses of CLREX with a dummy STREX sequence instead (as
          we did for v6 CPUs).
      
        - Removing the clrex code from v7_exit_coherency_flush and derivatives,
          since this only exists as a minor performance improvement when
          non-cached exclusives are in use (Linux doesn't use these).
      
      Benchmarking on a variety of ARM cores revealed no measurable
      performance difference with this change applied, so the change is
      performed unconditionally and no new Kconfig entry is added.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      
      (cherry picked from commit 2c32c65e)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • ARM: 8128/1: abort: don't clear the exclusive monitors · bc4c7e40
      Mark Rutland authored
      The ARMv6 and ARMv7 early abort handlers clear the exclusive monitors
      upon entry to the kernel, but this is redundant:
      
        - We clear the monitors on every exception return since commit
          200b812d ("Clear the exclusive monitor when returning from an
          exception"), so this is not necessary to ensure the monitors are
          cleared before returning from a fault handler.
      
        - Any dummy STREX will target a temporary scratch area in memory, and
          may succeed or fail without corrupting useful data. Its status value
          will not be used.
      
        - Any other STREX in the kernel must be preceded by an LDREX, which
          will initialise the monitors consistently and will not depend on the
          earlier state of the monitors.
      
      Therefore we have no reason to care about the initial state of the
      exclusive monitors when a data abort is taken, and clearing the monitors
      prior to exception return (as we already do) is sufficient.
      
      This patch removes the redundant clearing of the exclusive monitors from
      the early abort handlers.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      
      (cherry picked from commit 85868313)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • HID: magicmouse: sanity check report size in raw_event() callback · 1e2d552a
      Jiri Kosina authored
      The report passed to us from transport driver could potentially be
      arbitrarily large, therefore we better sanity-check it so that
      magicmouse_emit_touch() gets only valid values of raw_id.
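
      A sketch of what such a size check can look like. The header length,
      record length and table capacity below are assumptions made for
      illustration, as is the helper name touch_count_from_report(); none of
      them is taken from the driver.

```c
#include <stddef.h>

#define HEADER_LEN	6	/* assumed fixed report header, in bytes */
#define TOUCH_LEN	8	/* assumed size of one touch record */
#define MAX_TOUCHES	16	/* assumed capacity of the touch table */

/*
 * Derive the number of touch records from the report size and refuse
 * anything that is truncated, misaligned or larger than the table.
 * Returns the record count, or -1 for a report that must be dropped.
 */
int touch_count_from_report(size_t size)
{
	size_t payload, npoints;

	if (size < HEADER_LEN)
		return -1;

	payload = size - HEADER_LEN;
	if (payload % TOUCH_LEN != 0)
		return -1;

	npoints = payload / TOUCH_LEN;
	if (npoints > MAX_TOUCHES)
		return -1;

	return (int)npoints;
}
```

      Any index later derived from the report is then bounded by the count
      this helper returns.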
      
      Cc: stable@vger.kernel.org
      Reported-by: Steven Vittitoe <scvitti@google.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      
      (cherry picked from commit c54def7b)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • USB: sisusb: add device id for Magic Control USB video · c1112865
      Stephen Hemminger authored
      I have a j5 create (JUA210) USB 2 video device and adding its device id
      to SIS USB video gets it to work.
      
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      (cherry picked from commit 5b6b80ae)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • HID: logitech-dj: prevent false errors to be shown · d4ee521d
      Benjamin Tissoires authored
      Commit "HID: logitech: perform bounds checking on device_id early
      enough" unfortunately leaks some errors to dmesg which are not real
      ones:
      - if the report is not a DJ one, then there is no point in checking
        the device_id
      - the receiver (index 0) can also receive some notifications which
        can be safely ignored given the current implementation
      
      Move out the test regarding the report_id and also discard
      printing errors when the receiver got notified.
      
      Fixes: ad3e14d7
      
      Cc: stable@vger.kernel.org
      Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      
      (cherry picked from commit 5abfe85c)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • USB: ftdi_sio: Added PID for new ekey device · 46f96315
      Jaša Bartelj authored
      Added support to the ftdi_sio driver for ekey Converter USB which
      uses an FT232BM chip.
      
      Signed-off-by: Jaša Bartelj <jasa.bartelj@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      
      (cherry picked from commit 646907f5)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • USB: serial: pl2303: add device id for ztek device · 5c478599
      Greg KH authored
      This adds a new device id to the pl2303 driver for the ZTEK device.
      
      Reported-by: Mike Chu <Mike-Chu@prolific.com.tw>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      
      (cherry picked from commit 91fcb1ce)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • USB: option: add VIA Telecom CDS7 chipset device id · 4c1a1c23
      Brennan Ashton authored
      This VIA Telecom baseband processor is used by u-blox in both the
      FW2770 and FW2760 products and may be used in others as well.
      
      This patch has been tested on both of these modem versions.
      
      Signed-off-by: Brennan Ashton <bashton@brennanashton.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      
      (cherry picked from commit d7730273)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • xtensa: fix a6 and a7 handling in fast_syscall_xtensa · b87c951d
      Max Filippov authored
      Remove restoring a6 on some return paths and instead modify and restore
      it in a single place, using symbolic name.
      Correctly restore a7 from PT_AREG7 in case of illegal a6 value.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      
      (cherry picked from commit d1b6ba82)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • xtensa: fix TLBTEMP_BASE_2 region handling in fast_second_level_miss · eb398c35
      Max Filippov authored
      The current definition of TLBTEMP_BASE_2 is always 32K above
      TLBTEMP_BASE_1, whereas the fast_second_level_miss handler for the TLBTEMP
      region analyzes virtual address bit (PAGE_SHIFT + DCACHE_ALIAS_ORDER)
      to determine the TLBTEMP region where the fault happened. The size of the
      TLBTEMP region is also checked incorrectly: not 64K, but twice the data
      cache way size (which may as well be less than the instruction cache
      way size).
      
      Fix TLBTEMP_BASE_2 to be TLBTEMP_BASE_1 + data cache way size.
      Provide TLBTEMP_SIZE that is the greater of doubled data cache way size or
      the instruction cache way size, and use it to determine if the second
      level TLB miss occurred in the TLBTEMP region.
      
      Practical occurrence of page faults in the TLBTEMP area is extremely
      rare; this code can be tested by deleting all w[di]tlb instructions
      in the tlbtemp_mapping region.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      
      (cherry picked from commit 7128039f)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • xtensa: fix address checks in dma_{alloc,free}_coherent · 523733da
      Alan Douglas authored
      Virtual address is translated to the XCHAL_KSEG_CACHED region in the
      dma_free_coherent, but is checked to be in the 0...XCHAL_KSEG_SIZE
      range.
      
      Change check for end of the range from 'addr >= X' to 'addr > X - 1' to
      handle the case of X == 0.
      
      Replace 'if (C) BUG();' construct with 'BUG_ON(C);'.
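
      Why 'addr > X - 1' behaves differently from 'addr >= X' when X is 0 can
      be shown with plain 32-bit arithmetic. The sketch below is illustrative:
      in_region() and in_region_naive() are hypothetical helpers, and X is
      modelled as base + size wrapping around at the top of the address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if addr lies in [base, base + size). The end of the range may wrap
 * to 0 when the region reaches the top of the 32-bit address space, so the
 * upper bound is tested as "addr <= end - 1" (the negation of
 * "addr > X - 1"). size is assumed to be nonzero.
 */
bool in_region(uint32_t addr, uint32_t base, uint32_t size)
{
	uint32_t end = (uint32_t)(base + size);	/* may wrap to 0 */

	assert(size != 0);
	return addr >= base && addr <= (uint32_t)(end - 1);
}

/* Same test written as "addr < end": always false once end has wrapped to 0. */
bool in_region_naive(uint32_t addr, uint32_t base, uint32_t size)
{
	uint32_t end = (uint32_t)(base + size);

	return addr >= base && addr < end;
}
```

      For a region starting at 0xF0000000 with size 0x10000000 the end wraps
      to 0, so the naive form rejects every address while the 'X - 1' form
      compares against 0xFFFFFFFF and accepts the whole region.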
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Alan Douglas <adouglas@cadence.com>
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      
      (cherry picked from commit 1ca49463)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • pata_scc: propagate return value of scc_wait_after_reset · bff54be9
      Arjun Sreedharan authored
      scc_bus_softreset should not necessarily return zero.
      Propagate the error code.
      
      Signed-off-by: Arjun Sreedharan <arjun024@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      
      (cherry picked from commit 4dc7c76c)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • ibmveth: Fix endian issues with rx_no_buffer statistic · 4132d9dd
      Anton Blanchard authored
      Hidden away in the last 8 bytes of the buffer_list page is a solitary
      statistic. It needs to be byte swapped or else ethtool -S will
      produce numbers that terrify the user.
      
      Since we do this in multiple places, create a helper function with a
      comment explaining what is going on.
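
      A portable sketch of such a helper, assuming the statistic is stored
      big-endian and the page is 4096 bytes. read_be64_tail() is a hypothetical
      name, and the byte-by-byte decode is used here in place of the kernel's
      own byte-order helpers so that the example stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096		/* assumed page size */

/*
 * Read the 64-bit big-endian counter stored in the last 8 bytes of a
 * page. Decoding byte by byte gives the same value on little-endian and
 * big-endian hosts.
 */
uint64_t read_be64_tail(const unsigned char *page)
{
	const unsigned char *p;
	uint64_t v = 0;

	assert(page != NULL);
	p = page + PAGE_BYTES - 8;

	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}
```

      Reading the same eight bytes as a native 64-bit integer on a
      little-endian host would reverse them, which is how a small counter
      turns into an enormous number.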
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit cbd52281)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • sparc64: Do not insert non-valid PTEs into the TSB hash table. · cc241ba4
      David S. Miller authored
      The assumption was that update_mmu_cache() (and the equivalent for PMDs) would
      only be called when the PTE being installed will be accessible by the user.
      
      This is not true for code paths originating from remove_migration_pte().
      
      There are dire consequences for placing a non-valid PTE into the TSB.  The TLB
      miss framework assumes that when a TSB entry matches we can just load it into
      the TLB and return from the TLB miss trap.
      
      So if a non-valid PTE is in there, we will deadlock taking the TLB miss over
      and over, never satisfying the miss.
      
      Just exit early from update_mmu_cache() and friends in this situation.
      
      Based upon a report and patch from Christopher Alexander Tobias Schulze.
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 18f38132)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86, espfix: Make espfix64 a Kconfig option, fix UML · 5d66c0c7
      H. Peter Anvin authored
      Make espfix64 a hidden Kconfig option.  This fixes the x86-64 UML
      build which had broken due to the non-existence of init_espfix_bsp()
      in UML: since UML uses its own Kconfig, this option does not appear in
      the UML build.
      
      This also makes it possible to make support for 16-bit segments a
      configuration option, for the people who want to minimize the size of
      the kernel.
      
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Richard Weinberger <richard@nod.at>
      Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
      
      (cherry picked from commit 197725de)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • rtmutex: Fix deadlock detector for real · 42262f2c
      Thomas Gleixner authored
      The current deadlock detection logic does not work reliably due to the
      following early exit path:
      
      	/*
      	 * Drop out, when the task has no waiters. Note,
      	 * top_waiter can be NULL, when we are in the deboosting
      	 * mode!
      	 */
      	if (top_waiter && (!task_has_pi_waiters(task) ||
      			   top_waiter != task_top_pi_waiter(task)))
      		goto out_unlock_pi;
      
      So this not only exits when the task has no waiters, it also exits
      unconditionally when the current waiter is not the top priority waiter
      of the task.
      
      So in a nested locking scenario, it might abort the lock chain walk
      and therefore miss a potential deadlock.
      
      Simple fix: Continue the chain walk, when deadlock detection is
      enabled.
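
      The difference between the two exit conditions can be modelled with
      plain booleans. This is a sketch of the logic described above, not the
      rtmutex code: the predicates are passed in as flags and both function
      names are hypothetical.

```c
#include <stdbool.h>

/*
 * Early exit as quoted above: stop when the task has no waiters, but also
 * whenever the current waiter is not the task's top priority waiter.
 */
bool stop_walk_old(bool have_top_waiter, bool task_has_waiters,
		   bool is_task_top_waiter)
{
	return have_top_waiter &&
	       (!task_has_waiters || !is_task_top_waiter);
}

/*
 * Sketch of the fix: the "not the top waiter" shortcut is only taken when
 * deadlock detection is off, so a detecting walk continues down the chain.
 */
bool stop_walk_new(bool have_top_waiter, bool task_has_waiters,
		   bool is_task_top_waiter, bool detect_deadlock)
{
	if (!have_top_waiter)
		return false;
	if (!task_has_waiters)
		return true;
	return !detect_deadlock && !is_task_top_waiter;
}
```

      With detection enabled, a waiter that is not the top priority waiter no
      longer ends the walk, which is the case the old condition cut short.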
      
      We also avoid the whole enqueue if we detect the deadlock right away
      (A-A). It's an optimization, but it also prevents another waiter, who
      comes in after the detection and before the task has undone the damage,
      from observing the situation, detecting the deadlock and returning
      -EDEADLOCK, which is wrong as the other task is not in a deadlock
      situation.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20140522031949.725272460@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      
      (cherry picked from commit 397335f0)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • KVM: x86: Increase the number of fixed MTRR regs to 10 · 8a98f151
      Nadav Amit authored
      commit 682367c4 upstream.
      
      Recent Intel CPUs have 10 variable range MTRRs. Since operating systems
      sometimes make assumptions on CPUs while they ignore capability MSRs, it is
      better for KVM to be consistent with recent CPUs. Reporting more MTRRs than
      actually supported has no functional implications.
      
      Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      (cherry picked from commit 80bbfbaa)
    • drm/vmwgfx: Fix incorrect write to read-only register v2: · 4194290b
      Thomas Hellstrom authored
      Commit "drm/vmwgfx: correct fb_fix_screeninfo.line_length", while fixing a
      vmwgfx fbdev bug, also writes the pitch to a supposedly read-only register:
      SVGA_REG_BYTES_PER_LINE, while it should be (and also in fact is) written to
      SVGA_REG_PITCHLOCK.
      
      This patch is Cc'd stable because of the unknown effects writing to this
      register might have, particularly on older device versions.
      
      v2: Updated log message.
      
      Cc: stable@vger.kernel.org
      Cc: Christopher Friedt <chrisfriedt@gmail.com>
      Tested-by: Christopher Friedt <chrisfriedt@gmail.com>
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
      
      (cherry picked from commit 4e578080)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • Bluetooth: Fix invalid length check in l2cap_information_rsp() · 48818e14
      Jaganath Kanakkassery authored
      The length check is invalid since the length varies with type of
      info response.
      
      This was introduced by the commit cb3b3152
      
      Because of this, l2cap info rsp is not handled and command reject is sent.
      
      > ACL data: handle 11 flags 0x02 dlen 16
              L2CAP(s): Info rsp: type 2 result 0
                Extended feature mask 0x00b8
                  Enhanced Retransmission mode
                  Streaming mode
                  FCS Option
                  Fixed Channels
      < ACL data: handle 11 flags 0x00 dlen 10
              L2CAP(s): Command rej: reason 0
                Command not understood
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jaganath Kanakkassery <jaganath.k@samsung.com>
      Signed-off-by: Chan-Yeol Park <chanyeol.park@samsung.com>
      Acked-by: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
      
      ath9k_htc: Handle IDLE state transition properly
      
      Make sure that a chip reset is done when IDLE is turned
      off - this fixes authentication timeouts.
      
      Cc: stable@vger.kernel.org
      Reported-by: Ignacy Gawedzki <i@lri.fr>
      Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      
      ath9k: fix an RCU issue in calling ieee80211_get_tx_rates
      
      ath_txq_schedule is called outside of the drv_tx call, so it needs RCU
      protection.
      
      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      
      Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      nl80211: fix attrbuf access race by allocating a separate one
      
      Since my commit 3713b4e3 ("nl80211: allow splitting wiphy
      information in dumps"), nl80211_dump_wiphy() uses the global
      nl80211_fam.attrbuf for parsing the incoming data. This wouldn't
      be a problem if it only did so on the first dump iteration which
      is locked against other commands in generic netlink, but due to
      space constraints in cb->args (the needed state doesn't fit) I
      decided to always parse the original message. That's racy though
      since nl80211_fam.attrbuf could be used by some other parsing in
      generic netlink concurrently.
      
      For now, fix this by allocating a separate parse buffer (it's a
      bit too big for the stack, currently 1448 bytes on 64-bit). For
      -next, I'll change the code to parse into the global buffer in
      the first round only and then allocate a smaller buffer to keep
      the data in cb->args.
      
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Acked-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      
      (cherry picked from commit da9910ac
      3f6fa3d4)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • intel_idle: Don't register CPU notifier if we are not running. · 63c9b9e3
      Konrad Rzeszutek Wilk authored
      The 'intel_idle_probe' probes the CPU and sets the CPU notifier.
      But if later on during the module initialization we fail (say
      in cpuidle_register_driver), we stop loading, but we neglect
      to unregister the CPU notifier.  This means that during CPU
      hotplug events the system will fail:
      
      calling  intel_idle_init+0x0/0x326 @ 1
      intel_idle: MWAIT substates: 0x1120
      intel_idle: v0.4 model 0x2A
      intel_idle: lapic_timer_reliable_states 0xffffffff
      intel_idle: intel_idle yielding to none
      initcall intel_idle_init+0x0/0x326 returned -19 after 14 usecs
      
      ... some time later, offlining and onlining a CPU:
      
      cpu 3 spinlock event irq 62
      BUG: unable to ] __cpuidle_register_device+0x1c/0x120
      PGD 99b8b067 PUD 99b95067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: xen_evtchn nouveau mxm_wmi wmi radeon ttm i915 fbcon tileblit font atl1c bitblit softcursor drm_kms_helper video xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xenfs xen_privcmd mperf
      CPU 0
      Pid: 2302, comm: udevd Not tainted 3.8.0-rc3upstream-00249-g09ad159 #1 MSI MS-7680/H61M-P23 (MS-7680)
      RIP: e030:[<ffffffff814d956c>]  [<ffffffff814d956c>] __cpuidle_register_device+0x1c/0x120
      RSP: e02b:ffff88009dacfcb8  EFLAGS: 00010286
      RAX: 0000000000000000 RBX: ffff880105380000 RCX: 000000000000001c
      RDX: 0000000000000000 RSI: 0000000000000055 RDI: ffff880105380000
      RBP: ffff88009dacfce8 R08: ffffffff81a4f048 R09: 0000000000000008
      R10: 0000000000000008 R11: 0000000000000000 R12: ffff880105380000
      R13: 00000000ffffffdd R14: 0000000000000000 R15: ffffffff81a523d0
      FS:  00007f37bd83b7a0(0000) GS:ffff880105200000(0000) knlGS:0000000000000000
      CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 00000000a09ea000 CR4: 0000000000042660
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process udevd (pid: 2302, threadinfo ffff88009dace000, task ffff88009afb47f0)
      Stack:
       ffffffff8107f2d0 ffffffff810c2fb7 ffff88009dacfce8 00000000ffffffea
       ffff880105380000 00000000ffffffdd ffff88009dacfd08 ffffffff814d9882
       0000000000000003 ffff880105380000 ffff88009dacfd28 ffffffff81340afd
      Call Trace:
       [<ffffffff8107f2d0>] ? collect_cpu_info_local+0x30/0x30
       [<ffffffff810c2fb7>] ? __might_sleep+0xe7/0x100
       [<ffffffff814d9882>] cpuidle_register_device+0x32/0x70
       [<ffffffff81340afd>] intel_idle_cpu_init+0xad/0x110
       [<ffffffff81340bc8>] cpu_hotplug_notify+0x68/0x80
       [<ffffffff8166023d>] notifier_call_chain+0x4d/0x70
       [<ffffffff810bc369>] __raw_notifier_call_chain+0x9/0x10
       [<ffffffff81094a4b>] __cpu_notify+0x1b/0x30
       [<ffffffff81652cf7>] _cpu_up+0x103/0x14b
       [<ffffffff81652e18>] cpu_up+0xd9/0xec
       [<ffffffff8164a254>] store_online+0x94/0xd0
       [<ffffffff814122fb>] dev_attr_store+0x1b/0x20
       [<ffffffff81216404>] sysfs_write_file+0xf4/0x170
       [<ffffffff811a1024>] vfs_write+0xb4/0x130
       [<ffffffff811a17ea>] sys_write+0x5a/0xa0
       [<ffffffff816643a9>] system_call_fastpath+0x16/0x1b
      Code: 03 18 00 c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 30 48 89 5d e8 4c 89 65 f0 48 89 fb 4c 89 6d f8 e8 84 08 00 00 <48> 8b 78 08 49 89 c4 e8 f8 7f c1 ff 89 c2 b8 ea ff ff ff 84 d2
      RIP  [<ffffffff814d956c>] __cpuidle_register_device+0x1c/0x120
       RSP <ffff88009dacfcb8>
      
      This patch fixes that by moving the CPU notifier registration
      as the last item to be done by the module.
      
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: 3.6+ <stable@vger.kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      
      (cherry picked from commit 6f8c2e79)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • net: ipv4: ip_forward: fix inverted local_df test · dd62a1f9
      Florian Westphal authored
      local_df means 'ignore DF bit if set', so if it's set we're
      allowed to perform ip fragmentation.
      
      This wasn't noticed earlier because the output path also drops such skbs
      (and emits needed icmp error) and because netfilter ip defrag did not
      set local_df until a couple of days ago.
      
      Only difference is that DF-packets-larger-than MTU are now discarded
      earlier (f.e. we avoid pointless netfilter postrouting trip).
      
      While at it, drop the repeated test ip_exceeds_mtu, checking it once
      is enough...
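
      The intended sense of the test, sketched on a simplified packet model.
      struct pkt and exceeds_mtu() are hypothetical stand-ins for the skb
      fields and the kernel helper, and GSO handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down packet descriptor. */
struct pkt {
	unsigned int len;	/* packet length in bytes */
	bool df;		/* DF bit set in the IP header */
	bool local_df;		/* 'ignore DF bit if set' */
};

/*
 * True if the packet must be dropped (with an ICMP error) because it is
 * too big and may not be fragmented. A set local_df means fragmentation
 * is allowed, so it must make this return false, not true.
 */
bool exceeds_mtu(const struct pkt *p, unsigned int mtu)
{
	assert(p != NULL);

	if (p->len <= mtu)
		return false;
	if (!p->df)
		return false;	/* no DF bit: fragmentation allowed */
	if (p->local_df)
		return false;	/* DF is to be ignored */
	return true;
}
```

      An inverted test would drop exactly the packets that local_df was meant
      to let through, and forward-path fragmentation would never happen for
      them.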
      
      Fixes: fe6cc55f ("net: ip, ipv6: handle gso skbs in forwarding path")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit ca6c5d4a)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • drm/vmwgfx: correct fb_fix_screeninfo.line_length · 11b7678d
      Christopher Friedt authored
      commit aa6de142 upstream.
      
      Previously, the vmwgfx_fb driver would allow users to call FBIOSET_VINFO, but it would not adjust
      the FINFO properly, resulting in distorted screen rendering. The patch corrects that behaviour.
      
      See https://bugs.gentoo.org/show_bug.cgi?id=494794 for examples.
      Signed-off-by: Christopher Friedt <chrisfriedt@gmail.com>
      Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      (cherry picked from commit b8a0ddef)
      11b7678d
    • Roger Quadros's avatar
      ARM: OMAP3: hwmod data: Correct clock domains for USB modules · ce3fd6ed
      Roger Quadros authored
      OMAP3 doesn't contain "l3_init_clkdm" clock domain. Use the
      proper clock domains for USB Host and USB TLL modules.
      
      Gets rid of the following warnings during boot
       omap_hwmod: usb_host_hs: could not associate to clkdm l3_init_clkdm
       omap_hwmod: usb_tll_hs: could not associate to clkdm l3_init_clkdm
      Reported-by: Nishanth Menon <nm@ti.com>
      Cc: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Roger Quadros <rogerq@ti.com>
      Fixes: de231388 ("ARM: OMAP: USB: EHCI and OHCI hwmod structures for OMAP3")
      Cc: Keshava Munegowda <keshava_mgowda@ti.com>
      Cc: Partha Basak <parthab@india.ti.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
      
      (cherry picked from commit c6c56697)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      ce3fd6ed
    • Nicholas Santos's avatar
      HID: usbhid: quirk for Formosa IR receiver · 1df94751
      Nicholas Santos authored
      Patch to add the Formosa Industrial Computing, Inc. Infrared Receiver
      [IR605A/Q] to hid-ids.h and hid-quirks.c.  This IR receiver causes about a 10
      second timeout when the usbhid driver attempts to initialize the device.  Adding
      this device to the quirks list with HID_QUIRK_NO_INIT_REPORTS removes the
      delay.
      Signed-off-by: Nicholas Santos <nicholas.santos@gmail.com>
      [jkosina@suse.cz: fix ordering]
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      
      (cherry picked from commit 320cde19)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      1df94751
    • Lai Jiangshan's avatar
      workqueue: ensure @task is valid across kthread_stop() · e68da712
      Lai Jiangshan authored
      When a kworker should die, the kworker is notified through WORKER_DIE
      flag instead of kthread_should_stop().  This, IIRC, is primarily to
      keep the test synchronized inside worker_pool lock.  WORKER_DIE is
      first set while holding pool->lock, the lock is dropped and
      kthread_stop() is called.
      
      Unfortunately, this means that there's a slight chance that the target
      kworker may see WORKER_DIE before kthread_stop() finishes and exits
      and frees the target task before or during kthread_stop().
      
      Fix it by pinning the target task before setting WORKER_DIE and
      putting it after kthread_stop() is done.
      
      tj: Improved patch description and comment.  Moved pinning above
          WORKER_DIE to better signify what it's protecting.
      
      CC: stable@vger.kernel.org
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      
      (cherry picked from commit 5bdfff96)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      e68da712
    • Jason Gunthorpe's avatar
      tpm: Provide a generic means to override the chip returned timeouts · 6b2a55ea
      Jason Gunthorpe authored
      Some Atmel TPMs provide completely wrong timeouts from their
      TPM_CAP_PROP_TIS_TIMEOUT query. This patch detects that and returns
      new correct values via a DID/VID table in the TIS driver.
      
      Tested on ARM using an AT97SC3204T FW version 37.16
      
      Cc: <stable@vger.kernel.org>
      [PHuewe: without this fix these 'broken' Atmel TPMs won't function on
      older kernels]
      Signed-off-by: "Berg, Christopher" <Christopher.Berg@atmel.com>
      Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      
      (cherry picked from commit 8e54caf4)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      6b2a55ea
    • Linus Torvalds's avatar
      vfs: avoid non-forwarding large load after small store in path lookup · 8aaa881c
      Linus Torvalds authored
      The performance regression that Josef Bacik reported in the pathname
      lookup (see commit 99d263d4 "vfs: fix bad hashing of dentries") made
      me look at performance stability of the dcache code, just to verify that
      the problem was actually fixed.  That turned up a few other problems in
      this area.
      
      There are a few cases where we exit RCU lookup mode and go to the slow
      serializing case when we shouldn't, Al has fixed those and they'll come
      in with the next VFS pull.
      
      But my performance verification also shows that link_path_walk() turns
      out to have a very unfortunate 32-bit store of the length and hash of
      the name we look up, followed by a 64-bit read of the combined hash_len
      field.  That screws up the processor store to load forwarding, causing
      an unnecessary hiccup in this critical routine.
      
      It's caused by the ugly calling convention for the "hash_name()"
      function, and easily fixed by just making hash_name() fill in the whole
      'struct qstr' rather than passing it a pointer to just the hash value.
      
      With that, the profile for this function looks much smoother.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      Merge branch 'parisc-3.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
      
      Pull parisc updates from Helge Deller:
       "The most important patch is a new Light Weight Syscall (LWS) for 8,
        16, 32 and 64 bit atomic CAS operations which is required in order to
        be able to implement the atomic gcc builtins on our platform.
      
        Other than that, we wire up the seccomp, getrandom and memfd_create
        syscalls, and fix a minor off-by-one bug and a wrong printk string"
      
      * 'parisc-3.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Implement new LWS CAS supporting 64 bit operations.
        parisc: Wire up seccomp, getrandom and memfd_create syscalls
        parisc: dino: fix %d confusingly prefixed with 0x in format string
        parisc: sys_hpux: NUL terminator is one past the end
      
      Merge tag 'ntb-3.17' of git://github.com/jonmason/ntb
      
      Pull ntb driver bugfixes from Jon Mason:
       "NTB driver fixes for queue spread and buffer alignment.  Also, update
        to MAINTAINERS to reflect new e-mail address"
      
      * tag 'ntb-3.17' of git://github.com/jonmason/ntb:
        ntb: Add alignment check to meet hardware requirement
        MAINTAINERS: update NTB info
        NTB: correct the spread of queues over mw's
      
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
      
      Pull ARM irq chip fixes from Thomas Gleixner:
       "Another pile of ARM specific irq chip fixlets:
      
         - off by one bugs in the crossbar driver
         - missing annotations
         - a bunch of "make it compile" updates
      
        I pulled the lot today from Jason, but it has been in -next for at
        least a week"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip: gic-v3: Declare rdist as __percpu pointer to __iomem pointer
        irqchip: gic: Make gic_default_routable_irq_domain_ops static
        irqchip: exynos-combiner: Fix compilation error on ARM64
        irqchip: crossbar: Off by one bugs in init
        irqchip: gic-v3: Tag all low level accessors __maybe_unused
        irqchip: gic-v3: Only define gic_peek_irq() when building SMP
      
      Merge tag 'irqchip-urgent-3.17' of git://git.infradead.org/users/jcooper/linux into irq/urgent
      
      irqchip fixes for v3.17 from Jason Cooper
      
       - GIC/GICV3: Various fixlets
       - crossbar: Fix off-by-one bug
       - exynos-combiner: Fix arm64 build error
      
      ntb: Add alignment check to meet hardware requirement
      
      The value in the NTB translate register must be BAR size aligned.
      This alignment check makes sure that the DMA memory allocated has the
      proper alignment. Another requirement for NTB to function properly with
      a memory window BAR size greater than or equal to 4M is to use the CMA
      feature in the 3.16 kernel with the appropriate CONFIG_CMA_ALIGNMENT and
      CONFIG_CMA_SIZE_MBYTES set.
      Signed-off-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Jon Mason <jdmason@kudzu.us>
      
      MAINTAINERS: update NTB info
      
      Update my contact info to my personal email address and add Dave Jiang.
      Signed-off-by: Jon Mason <jon.mason@intel.com>
      Signed-off-by: Dave Jiang <dave.jiang@intel.com>
      
      NTB: correct the spread of queues over mw's
      
      The detection of an uneven number of queues on the given memory windows
      was not correct.  The mw_num is zero based and the mod should be
      division to spread them evenly over the mw's.
      Signed-off-by: Jon Mason <jon.mason@intel.com>
      
      Merge branches 'locking-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
      
      Pull futex and timer fixes from Thomas Gleixner:
       "A oneliner bugfix for the jinxed futex code:
      
         - Drop hash bucket lock in the error exit path.  I really could slap
           myself for introducing that bug while fixing all the other horror
           in that code three months ago ...
      
        and the timer department is not too proud about the following fixes:
      
         - Deal with a long standing rounding bug in the timeval to jiffies
           conversion.  It's a real issue and this fix fell through the cracks
           for quite some time.
      
         - Another round of alarmtimer fixes.  Finally this code gets used
           more widely and the subtle issues hidden for quite some time are
           noticed and fixed.  Nothing really exciting, just the itty bitty
           details which bite the serious users here and there"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        futex: Unlock hb->lock in futex_wait_requeue_pi() error path
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        alarmtimer: Lock k_itimer during timer callback
        alarmtimer: Do not signal SIGEV_NONE timers
        alarmtimer: Return relative times in timer_gettime
        jiffies: Fix timeval conversion to jiffies
      
      parisc: Implement new LWS CAS supporting 64 bit operations.
      
      The current LWS cas only works correctly for 32bit. The new LWS allows
      for CAS operations of variable size.
      Signed-off-by: Guy Martin <gmsoft@tuxicoman.be>
      Cc: <stable@vger.kernel.org> # 3.13+
      Signed-off-by: Helge Deller <deller@gmx.de>
      
      vfs: fix bad hashing of dentries
      
      Josef Bacik found a performance regression between 3.2 and 3.10 and
      narrowed it down to commit bfcfaa77 ("vfs: use 'unsigned long'
      accesses for dcache name comparison and hashing"). He reports:
      
       "The test case is essentially
      
            for (i = 0; i < 1000000; i++)
                    mkdir("a$i");
      
        On xfs on a fio card this goes at about 20k dir/sec with 3.2, and 12k
        dir/sec with 3.10.  This is because we spend waaaaay more time in
        __d_lookup on 3.10 than in 3.2.
      
        The new hashing function for strings is suboptimal for <
        sizeof(unsigned long) string names (and hell even > sizeof(unsigned
        long) string names that I've tested).  I broke out the old hashing
        function and the new one into a userspace helper to get real numbers
        and this is what I'm getting:
      
            Old hash table had 1000000 entries, 0 dupes, 0 max dupes
            New hash table had 12628 entries, 987372 dupes, 900 max dupes
            We had 11400 buckets with a p50 of 30 dupes, p90 of 240 dupes, p99 of 567 dupes for the new hash
      
        My test does the hash, and then does the d_hash into a integer pointer
        array the same size as the dentry hash table on my system, and then
        just increments the value at the address we got to see how many
        entries we overlap with.
      
        As you can see the old hash function ended up with all 1 million
        entries in their own bucket, whereas the new one they are only
        distributed among ~12.5k buckets, which is why we're using so much
        more CPU in __d_lookup".
      
      The reason for this hash regression is two-fold:
      
       - On 64-bit architectures the down-mixing of the original 64-bit
         word-at-a-time hash into the final 32-bit hash value is very
         simplistic and suboptimal, and just adds the two 32-bit parts
         together.
      
         In particular, because there is no bit shuffling and the mixing
         boundary is also a byte boundary, similar character patterns in the
         low and high word easily end up just canceling each other out.
      
       - the old byte-at-a-time hash mixed each byte into the final hash as it
         hashed the path component name, resulting in the low bits of the hash
         generally being a good source of hash data.  That is not true for the
         word-at-a-time case, and the hash data is distributed among all the
         bits.
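
The cancellation described in the first point can be reproduced in a few lines. This is a userspace sketch, not the dcache code: `fold_add()` models the "add the two 32-bit parts" down-mix, and `fold_mul()` is a generic multiplicative mix. The multiplier is a commonly used 64-bit golden-ratio constant chosen for illustration and is an assumption here, not necessarily the constant the kernel uses.

```c
#include <stdint.h>

/* Naive down-mix: add the low and high 32-bit halves. */
static uint32_t fold_add(uint64_t h)
{
	return (uint32_t)h + (uint32_t)(h >> 32);
}

/*
 * Multiplicative down-mix: multiply by an odd 64-bit constant and keep
 * the top 32 bits, so every input bit can influence the result.
 * The constant is an illustrative assumption.
 */
static uint32_t fold_mul(uint64_t h)
{
	return (uint32_t)((h * UINT64_C(0x9E3779B97F4A7C15)) >> 32);
}
```

Two inputs whose halves are swapped, such as 0x0000000100000002 and 0x0000000200000001, collide under `fold_add()` (both give 3) but not under `fold_mul()`.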
      
      The fix is the same in both cases: do a better job of mixing the bits up
      and using as much of the hash data as possible.  We already have the
      "hash_32|64()" functions to do that.
      Reported-by: Josef Bacik <jbacik@fb.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      
      alarmtimer: Lock k_itimer during timer callback
      
      Locks the k_itimer's it_lock member when handling the alarm timer's
      expiry callback.
      
      The regular posix timers defined in posix-timers.c have this lock held
      during timeout processing because their callbacks are routed through
      posix_timer_fn().  The alarm timers follow a different path, so they
      ought to grab the lock somewhere else.
      
      Cc: stable@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sharvil Nanavati <sharvil@google.com>
      Signed-off-by: Richard Larocque <rlarocque@google.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      
      alarmtimer: Do not signal SIGEV_NONE timers
      
      Avoids sending a signal to alarm timers created with sigev_notify set to
      SIGEV_NONE by checking for that special case in the timeout callback.
      
      The regular posix timers avoid sending signals to SIGEV_NONE timers by
      not scheduling any callbacks for them in the first place.  Although it
      would be possible to do something similar for alarm timers, it's simpler
      to handle this as a special case in the timeout.
      
      Prior to this patch, the alarm timer would ignore the sigev_notify value
      and try to deliver signals to the process anyway.  Even worse, the
      sanity check for the value of sigev_signo is skipped when SIGEV_NONE was
      specified, so the signal number could be bogus.  If sigev_signo was an
      uninitialized value (as it often would be if SIGEV_NONE is used), then
      it's hard to predict which signal will be sent.
      
      Cc: stable@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sharvil Nanavati <sharvil@google.com>
      Signed-off-by: Richard Larocque <rlarocque@google.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      
      alarmtimer: Return relative times in timer_gettime
      
      Returns the time remaining for an alarm timer, rather than the time at
      which it is scheduled to expire.  If the timer has already expired or it
      is not currently scheduled, the it_value's members are set to zero.
      
      This new behavior matches that of the other posix-timers and the POSIX
      specifications.
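
The rule can be written down as a small helper. This is only a sketch of the described behavior: `remaining_ns()` and its `armed` flag are hypothetical, and times are plain nanosecond counts rather than the kernel's time types.

```c
#include <stdint.h>

/*
 * Time left until expiry, in nanoseconds. Returns zero when the timer
 * is not scheduled (armed == 0) or has already expired, instead of
 * reporting the absolute expiry time.
 */
static int64_t remaining_ns(int64_t expires_ns, int64_t now_ns, int armed)
{
	if (!armed || expires_ns <= now_ns)
		return 0;
	return expires_ns - now_ns;
}
```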
      
      This is a change in user-visible behavior, and may break existing
      applications.  Hopefully, few users rely on the old incorrect behavior.
      
      Cc: stable@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sharvil Nanavati <sharvil@google.com>
      Signed-off-by: Richard Larocque <rlarocque@google.com>
      [jstultz: minor style tweak]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      
      jiffies: Fix timeval conversion to jiffies
      
      timeval_to_jiffies tried to round a timeval up to an integral number
      of jiffies, but the logic for doing so was incorrect: intervals
      corresponding to exactly N jiffies would become N+1. This manifested
      itself particularly when repeatedly stopping/starting an itimer:
      
      setitimer(ITIMER_PROF, &val, NULL);
      setitimer(ITIMER_PROF, NULL, &val);
      
      would add a full tick to val, _even if it was exactly representable in
      terms of jiffies_ (say, the result of a previous rounding.)  Doing
      this repeatedly would cause unbounded growth in val.  So fix the math.
      
      Here's what was wrong with the conversion: we essentially computed
      (eliding seconds)
      
      jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)
      
      by using scaling arithmetic, which took the best approximation of
      NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
      x/(2^USEC_JIFFIE_SC), and computed:
      
      jiffies = (usec * x) >> USEC_JIFFIE_SC
      
      and rounded this calculation up in the intermediate form (since we
      can't necessarily exactly represent TICK_NSEC in usec.) But the
      scaling arithmetic is a (very slight) *over*approximation of the true
      value; that is, instead of dividing by (1 usec/ 1 jiffie), we
      effectively divided by (1 usec/1 jiffie)-epsilon (rounding
      down). This would normally be fine, but we want to round timeouts up,
      and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
      would be fine if our division was exact, but dividing this by the
      slightly smaller factor was equivalent to adding just _over_ 1 to the
      final result (instead of just _under_ 1, as desired.)
      
      In particular, with HZ=1000, we consistently computed that 10000 usec
      was 11 jiffies; the same was true for any exact multiple of
      TICK_NSEC.
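
The arithmetic can be checked with a small userspace model. Everything below is an assumption made for illustration: TICK_NSEC is taken as an idealized 1000000 ns (HZ=1000), the scaling shift is fixed at 40, and the "new" variant uses plain integer division in nanoseconds rather than the kernel's scaled timespec conversion.

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL
#define TICK_NSEC     1000000ULL /* assumed: idealized tick for HZ=1000 */
#define SC            40         /* assumed scaling shift for this model */

/* NSEC_PER_USEC/TICK_NSEC scaled by 2^SC, rounded up (a slight overestimate). */
#define USEC_CONV  (((NSEC_PER_USEC << SC) + TICK_NSEC - 1) / TICK_NSEC)
/* Round-up term added in the scaled, intermediate form. */
#define USEC_ROUND ((1ULL << SC) - 1)

/* Old scheme: round up before the shift; exact multiples gain a tick. */
static uint64_t old_usec_to_jiffies(uint64_t usec)
{
	return (usec * USEC_CONV + USEC_ROUND) >> SC;
}

/* Model of the fix: convert to nanoseconds first, then round up there. */
static uint64_t new_usec_to_jiffies(uint64_t usec)
{
	uint64_t nsec = usec * NSEC_PER_USEC;

	return (nsec + TICK_NSEC - 1) / TICK_NSEC;
}
```

Under these assumptions, 10000 usec maps to 11 ticks with the old scheme and to 10 with the nanosecond-based rounding, while 10001 usec still rounds up to 11.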
      
      We could possibly still round in the intermediate form, adding
      something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
      convert usec->nsec, round in nanoseconds, and then convert using
      time*spec*_to_jiffies.  This adds one constant multiplication, and is
      not observably slower in microbenchmarks on recent x86 hardware.
      
      Tested: the following program:
      
      #include <stddef.h>
      #include <stdio.h>
      #include <sys/time.h>
      
      int main() {
        struct itimerval zero = {{0, 0}, {0, 0}};
        /* Initially set to 10 ms. */
        struct itimerval initial = zero;
        initial.it_interval.tv_usec = 10000;
        setitimer(ITIMER_PROF, &initial, NULL);
        /* Save and restore several times. */
        for (size_t i = 0; i < 10; ++i) {
          struct itimerval prev;
          setitimer(ITIMER_PROF, &zero, &prev);
          /* on old kernels, this goes up by TICK_USEC every iteration */
          printf("previous value: %ld %ld %ld %ld\n",
                 prev.it_interval.tv_sec, prev.it_interval.tv_usec,
                 prev.it_value.tv_sec, prev.it_value.tv_usec);
          setitimer(ITIMER_PROF, &prev, NULL);
        }
        return 0;
      }
      
      Cc: stable@vger.kernel.org
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Reviewed-by: Paul Turner <pjt@google.com>
      Reported-by: Aaron Jacobs <jacobsa@google.com>
      Signed-off-by: Andrew Hunter <ahh@google.com>
      [jstultz: Tweaked to apply to 3.17-rc]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      
      futex: Unlock hb->lock in futex_wait_requeue_pi() error path
      
      futex_wait_requeue_pi() calls futex_wait_setup(). If
      futex_wait_setup() succeeds it returns with hb->lock held and
      preemption disabled. Now the sanity check after this does:
      
              if (match_futex(&q.key, &key2)) {
                      ret = -EINVAL;
                      goto out_put_keys;
              }
      
      which releases the keys but does not release hb->lock.
      
      So we happily return to user space with hb->lock held and therefore
      preemption disabled.
      
      Unlock hb->lock before taking the exit route.
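
The shape of the bug and its fix can be modelled outside the kernel. The sketch below is hypothetical throughout: `struct bucket_lock` only records whether the lock is held, and `wait_setup()` and `wait_requeue()` stand in for the real functions without reproducing their logic.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hash-bucket lock; `held` only tracks state. */
struct bucket_lock { bool held; };

static void lock_bucket(struct bucket_lock *l)   { l->held = true; }
static void unlock_bucket(struct bucket_lock *l) { l->held = false; }

/* Models a setup step that returns with the lock held on success. */
static int wait_setup(struct bucket_lock *l)
{
	lock_bucket(l);
	return 0;
}

static int wait_requeue(struct bucket_lock *l, int key1, int key2)
{
	int ret = wait_setup(l);

	if (ret)
		goto out;

	if (key1 == key2) {
		/* Without this unlock the error path leaks the lock. */
		unlock_bucket(l);
		ret = -EINVAL;
		goto out;
	}

	/* ... the normal path would queue the waiter here ... */
	unlock_bucket(l);
out:
	return ret;
}
```

Whichever exit is taken, the caller gets the lock back in the released state.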
      Reported-by: Dave "Trinity" Jones <davej@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Darren Hart <dvhart@linux.intel.com>
      Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1409112318500.4178@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      
      irqchip: gic-v3: Declare rdist as __percpu pointer to __iomem pointer
      
      The __percpu __iomem annotations on the rdist base are contradictory
      and confuse static checkers such as sparse.
      
      This patch fixes the annotations so that rdist is described as a __percpu
      pointer to an __iomem pointer.
      
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Link: https://lkml.kernel.org/r/1409062410-25891-9-git-send-email-will.deacon@arm.com
      Signed-off-by: Jason Cooper <jason@lakedaemon.net>
      
      irqchip: gic: Make gic_default_routable_irq_domain_ops static
      
      The internal irq domain ops for the GIC are not used directly anywhere
      else, so make them static. This gets rid of a sparse warning on the
      file.
      
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Link: https://lkml.kernel.org/r/1409062410-25891-8-git-send-email-will.deacon@arm.com
      Signed-off-by: Jason Cooper <jason@lakedaemon.net>
      
      irqchip: exynos-combiner: Fix compilation error on ARM64
      
      The following compilation error occurs on 64-bit Exynos7 SoC:
      
      drivers/irqchip/exynos-combiner.c: In function ‘combiner_irq_domain_map’:
      drivers/irqchip/exynos-combiner.c:162:2: error: implicit declaration of function ‘set_irq_flags’ [-Werror=implicit-function-declaration]
        set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);
        ^
      drivers/irqchip/exynos-combiner.c:162:21: error: ‘IRQF_VALID’ undeclared (first use in this function)
        set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);
                           ^
      drivers/irqchip/exynos-combiner.c:162:21: note: each undeclared identifier is reported only once for each function it appears in
      drivers/irqchip/exynos-combiner.c:162:34: error: ‘IRQF_PROBE’ undeclared (first use in this function)
        set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);
      
      Fix the build error by including linux/interrupt.h.
      Signed-off-by: Naveen Krishna Chatradhi <ch.naveen@samsung.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Link: https://lkml.kernel.org/r/1409722329-18309-1-git-send-email-ch.naveen@samsung.com
      Signed-off-by: Jason Cooper <jason@lakedaemon.net>
      
      parisc: Wire up seccomp, getrandom and memfd_create syscalls
      
      With secure computing we only support the SECCOMP_MODE_STRICT mode for
      now.
      Signed-off-by: Helge Deller <deller@gmx.de>
      
      parisc: dino: fix %d confusingly prefixed with 0x in format string
      Signed-off-by: Hans Wennborg <hans@hanshq.net>
      Signed-off-by: Helge Deller <deller@gmx.de>
      
      parisc: sys_hpux: NUL terminator is one past the end
      
      We allocate "len" number of chars so we should put the NUL at "len - 1"
      to avoid corrupting memory.  Btw, strlen_user() is different from the
      normal strlen() function because it includes NUL terminator in the
      count.
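
The indexing rule is easy to get wrong, so here is a userspace sketch of it. `len_with_nul()` is a hypothetical helper that imitates the counting convention described above (terminator included), and `copy_string()` is illustrative only.

```c
#include <stdlib.h>
#include <string.h>

/* Imitates the described convention: the length includes the NUL. */
static size_t len_with_nul(const char *s)
{
	return strlen(s) + 1;
}

/* Copies src into a freshly allocated buffer of exactly `len` bytes. */
static char *copy_string(const char *src)
{
	size_t len = len_with_nul(src);
	char *buf = malloc(len);

	if (!buf)
		return NULL;
	memcpy(buf, src, len - 1);
	/* Last valid index is len - 1; buf[len] would be out of bounds. */
	buf[len - 1] = '\0';
	return buf;
}
```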
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Helge Deller <deller@gmx.de>
      
      irqchip: crossbar: Off by one bugs in init
      
      My static checker complains that the ">" should be ">=" or else we go
      beyond the end of the cb->irq_map[] array on the next line.
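
A minimal illustration of why the comparison matters, using a hypothetical fixed-size map rather than the driver's own data structures:

```c
#include <stddef.h>

#define MAX_ENTRIES 4 /* hypothetical array size for this sketch */

/* Marks `entry` as reserved; returns -1 if it lies outside the map. */
static int reserve_entry(int map[MAX_ENTRIES], size_t entry)
{
	/* With ">" instead of ">=", entry == MAX_ENTRIES would slip
	 * through and the write below would land one past the end. */
	if (entry >= MAX_ENTRIES)
		return -1;
	map[entry] = 1;
	return 0;
}
```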
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jason Cooper <jason@lakedaemon.net>
      
      irqchip: gic-v3: Tag all low level accessors __maybe_unused
      
      This is only really needed for gic_write_sgi1r in the !SMP case since it
      is only referenced in the SMP initialisation code but it seems better to
      have these functions all next to each other and declared consistently.
      Signed-off-by: Mark Brown <broonie@linaro.org>
      Link: https://lkml.kernel.org/r/1406748194-21094-1-git-send-email-broonie@kernel.org
      Signed-off-by: Jason Cooper <jason@lakedaemon.net>
      
      irqchip: gic-v3: Only define gic_peek_irq() when building SMP
      
      If building with CONFIG_SMP disabled (for example, with allnoconfig) then
      GCC complains that the static function gic_peek_irq() is defined but not
      used since the only reference is in the SMP initialisation code. Fix this
      by moving the function definition inside the ifdef.
      Signed-off-by: Mark Brown <broonie@linaro.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Link: https://lkml.kernel.org/r/1406480224-24628-1-git-send-email-broonie@kernel.org
      Signed-off-by: Jason Cooper <jason@lakedaemon.net>
      
      (cherry picked from commit 9226b5b4
      99d263d4)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      8aaa881c
    • Al Viro's avatar
      dcache.c: get rid of pointless macros · 145fce8d
      Al Viro authored
      D_HASH{MASK,BITS} are used once each, both in the same function (d_hash()).
      At this point they are actively misguiding - they imply that values are
      compiler constants, which is no longer true.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      
      (cherry picked from commit 482db906)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      145fce8d
    • Tejun Heo's avatar
      blkcg: don't call into policy draining if root_blkg is already gone · 3f2c76f9
      Tejun Heo authored
      While a queue is being destroyed, all the blkgs are destroyed and its
      ->root_blkg pointer is set to NULL.  If someone else starts to drain
      while the queue is in this state, the following oops happens.
      
        NULL pointer dereference at 0000000000000028
        IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
        PGD e4a1067 PUD b773067 PMD 0
        Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
        Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
        CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
        RIP: 0010:[<ffffffff8144e944>]  [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
        RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
        RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
        RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
        R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
        R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
        FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
        Stack:
         ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
         ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
         ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
        Call Trace:
         [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60
         [<ffffffff81427641>] __blk_drain_queue+0x71/0x180
         [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0
         [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120
         [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50
         [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40
         [<ffffffff8142d476>] blk_release_queue+0x26/0xd0
         [<ffffffff81454968>] kobject_cleanup+0x38/0x70
         [<ffffffff81454848>] kobject_put+0x28/0x60
         [<ffffffff81427505>] blk_put_queue+0x15/0x20
         [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0
         [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0
         [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20
         [<ffffffff817930e2>] device_release+0x32/0xa0
         [<ffffffff81454968>] kobject_cleanup+0x38/0x70
         [<ffffffff81454848>] kobject_put+0x28/0x60
         [<ffffffff817934d7>] put_device+0x17/0x20
         [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0
         [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40
         [<ffffffff817d1257>] sdev_store_delete+0x27/0x30
         [<ffffffff81792ca8>] dev_attr_store+0x18/0x30
         [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50
         [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170
         [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0
         [<ffffffff811f69bd>] SyS_write+0x4d/0xc0
         [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b
      
      776687bc ("block, blk-mq: draining can't be skipped even if
      bypass_depth was non-zero") made it easier to trigger this bug by
      making blk_queue_bypass_start() drain even when it loses the first
      bypass test to blk_cleanup_queue(); however, the bug has always been
      there even before the commit as blk_queue_bypass_start() could race
      against queue destruction, win the initial bypass test but perform the
      actual draining after blk_cleanup_queue() already destroyed all blkgs.
      
      Fix it by not calling into policy draining if all the blkgs are
      already gone.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Shirish Pargaonkar <spargaonkar@suse.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Reported-by: Jet Chen <jet.chen@intel.com>
      Cc: stable@vger.kernel.org
      Tested-by: Shirish Pargaonkar <spargaonkar@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      Revert "bio: modify __bio_add_page() to accept pages that don't start a new segment"
      
      This reverts commit 254c4407.
      
      It causes crashes with cryptsetup, even after a few iterations and
      updates. Drop it for now.
      
      blkcg: don't call into policy draining if root_blkg is already gone
      
      While a queue is being destroyed, all the blkgs are destroyed and its
      ->root_blkg pointer is set to NULL.  If someone else starts to drain
      while the queue is in this state, the following oops happens.
      
        NULL pointer dereference at 0000000000000028
        IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
        PGD e4a1067 PUD b773067 PMD 0
        Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
        Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
        CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
        RIP: 0010:[<ffffffff8144e944>]  [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
        RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
        RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
        RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
        R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
        R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
        FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
        Stack:
         ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
         ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
         ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
        Call Trace:
         [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60
         [<ffffffff81427641>] __blk_drain_queue+0x71/0x180
         [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0
         [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120
         [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50
         [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40
         [<ffffffff8142d476>] blk_release_queue+0x26/0xd0
         [<ffffffff81454968>] kobject_cleanup+0x38/0x70
         [<ffffffff81454848>] kobject_put+0x28/0x60
         [<ffffffff81427505>] blk_put_queue+0x15/0x20
         [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0
         [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0
         [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20
         [<ffffffff817930e2>] device_release+0x32/0xa0
         [<ffffffff81454968>] kobject_cleanup+0x38/0x70
         [<ffffffff81454848>] kobject_put+0x28/0x60
         [<ffffffff817934d7>] put_device+0x17/0x20
         [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0
         [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40
         [<ffffffff817d1257>] sdev_store_delete+0x27/0x30
         [<ffffffff81792ca8>] dev_attr_store+0x18/0x30
         [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50
         [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170
         [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0
         [<ffffffff811f69bd>] SyS_write+0x4d/0xc0
         [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b
      
      776687bc ("block, blk-mq: draining can't be skipped even if
      bypass_depth was non-zero") made it easier to trigger this bug by
      making blk_queue_bypass_start() drain even when it loses the first
      bypass test to blk_cleanup_queue(); however, the bug has always been
      there even before the commit as blk_queue_bypass_start() could race
      against queue destruction, win the initial bypass test but perform the
      actual draining after blk_cleanup_queue() already destroyed all blkgs.
      
      Fix it by not calling into policy draining if all the blkgs are
      already gone.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Shirish Pargaonkar <spargaonkar@suse.com>
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Reported-by: Jet Chen <jet.chen@intel.com>
      Cc: stable@vger.kernel.org
      Tested-by: Shirish Pargaonkar <spargaonkar@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
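
      A minimal sketch of the guard described above, not the actual patch.
      The structures are hypothetical, heavily simplified stand-ins for the
      kernel ones, and policy_drain() only models a policy that dereferences
      the root blkg.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct blkcg_gq {
    int nr_queued;
};

struct request_queue {
    struct blkcg_gq *root_blkg;  /* NULL once all blkgs are destroyed */
    int policy_drain_calls;      /* instrumentation for this sketch only */
};

/* Models a policy drain that dereferences the root blkg. */
static void policy_drain(struct request_queue *q)
{
    q->root_blkg->nr_queued = 0;
    q->policy_drain_calls++;
}

static void drain_queue(struct request_queue *q)
{
    /* All blkgs are already gone: there is nothing left to drain. */
    if (!q->root_blkg)
        return;
    policy_drain(q);
}
```

      With root_blkg set to NULL, drain_queue() returns before the policy
      code can touch the missing blkg.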
      
      bio: modify __bio_add_page() to accept pages that don't start a new segment
      
      The original behaviour is to refuse to add a new page if the maximum
      number of segments has been reached, regardless of whether the page we
      are going to add can be merged into the last segment.
      
      Unfortunately, when the system runs under heavy memory fragmentation
      conditions, a driver may try to add multiple pages to the last segment.
      The original code won't accept them and EBUSY will be reported to
      userspace.
      
      This patch modifies the function so it refuses to add a page only in case
      the latter starts a new segment and the maximum number of segments has
      already been reached.
      
      The bug can be easily reproduced with the st driver:
      
      1) set CONFIG_SCSI_MPT2SAS_MAX_SGE or CONFIG_SCSI_MPT3SAS_MAX_SGE  to 16
      2) modprobe st buffer_kbs=1024
      3) #dd if=/dev/zero of=/dev/st0 bs=1M count=10
         dd: error writing `/dev/st0': Device or resource busy
      
      [ming.lei@canonical.com: update bi_iter.bi_size before recounting segments]
      Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@canonical.com>
      Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
      Tested-by: Jet Chen <jet.chen@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
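
      A minimal sketch of the acceptance rule described in this message,
      not the actual __bio_add_page() code.  The function name and its
      parameters are hypothetical; the real function works on bio vectors
      and queue limits rather than on plain counters.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether one more page may be added.
 * nr_segs and max_segs are plain counters in this sketch.
 */
static bool may_add_page(unsigned int nr_segs, unsigned int max_segs,
                         bool merges_into_last_seg)
{
    /* A page that merges into the last segment adds no new segment. */
    if (merges_into_last_seg)
        return true;
    /* Otherwise it starts a new segment, which needs a free slot. */
    return nr_segs < max_segs;
}
```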
      
      block: fix SG_[GS]ET_RESERVED_SIZE ioctl when max_sectors is huge
      
      The SG_GET_RESERVED_SIZE and SG_SET_RESERVED_SIZE ioctls get and set the
      size of a reserved buffer, in bytes, as an int.  The value needs to be
      capped at the request queue's max_sectors, but integer overflow is not
      handled correctly when max_sectors is converted from sectors to bytes.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: linux-scsi@vger.kernel.org
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
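
      One way to avoid the overflow, sketched under the assumption of
      512-byte sectors: clamp the sector count before shifting so that the
      byte value always fits in an int.  The helper name is hypothetical and
      the sketch is not taken from the patch itself.

```c
#include <limits.h>

/*
 * Convert a sector count (512-byte units) to bytes, saturating so that
 * the result always fits in a non-negative int.
 */
static int sectors_to_bytes_capped(unsigned int max_sectors)
{
    /* Largest sector count whose byte size still fits in an int. */
    unsigned int limit = INT_MAX >> 9;

    if (max_sectors > limit)
        max_sectors = limit;
    return (int)(max_sectors << 9);
}
```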
      
      block: fix BLKSECTGET ioctl when max_sectors is greater than USHRT_MAX
      
      The BLKSECTGET ioctl stores the request queue's max_sectors through the
      argument pointer as an unsigned short value.  So if max_sectors is
      greater than USHRT_MAX, its upper 16 bits are simply discarded.

      In such a case, USHRT_MAX is preferable to the lower 16 bits of
      max_sectors.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: linux-scsi@vger.kernel.org
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
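
      The difference between truncating and saturating can be sketched as
      follows.  The helper name is hypothetical and the sketch is not the
      kernel code; it only shows the clamping the message argues for.

```c
#include <limits.h>

/* Saturate instead of silently dropping the upper bits. */
static unsigned short clamp_to_ushort(unsigned int max_sectors)
{
    if (max_sectors > USHRT_MAX)
        return USHRT_MAX;
    return (unsigned short)max_sectors;
}
```

      A plain (unsigned short) cast of a value above USHRT_MAX keeps only
      the low bits, which can report a limit far smaller than the real one.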
      
      block/partitions/efi.c: kerneldoc fixing
      
      Adding function documentation and fixing kerneldoc warnings
      ('field: description' uniformization).
      
      Cc: Davidlohr Bueso <davidlohr@hp.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      block/partitions/msdos.c: code clean-up
      
      checkpatch fixing:
      WARNING: Missing a blank line after declarations
      WARNING: space prohibited between function name and open parenthesis '('
      ERROR: spaces required around that '<' (ctx:VxV)
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      block/partitions/amiga.c: replace nolevel printk by pr_err
      
      Also add a prefix-less pr_fmt to guard against any future change to the
      default format.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      block/partitions/aix.c: replace count*size kzalloc by kcalloc
      
      kcalloc() checks the count * size multiplication for overflow.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Signed-off-by: Jens Axboe <axboe@fb.com>
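
      A user-space sketch of the overflow check that an array allocator of
      this kind performs.  zalloc_array() is a hypothetical name and uses
      malloc(); it is not the kernel's kcalloc() implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate a zeroed array of n elements, refusing if n * size overflows. */
static void *zalloc_array(size_t n, size_t size)
{
    void *p;

    /* n * size would wrap around: fail instead of under-allocating. */
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;

    p = malloc(n * size);
    if (p)
        memset(p, 0, n * size);
    return p;
}
```

      An open-coded zeroing allocation of count * size has no such check,
      so a wrapped product silently yields a buffer that is too small.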
      
      bio-integrity: add "bip_max_vcnt" into struct bio_integrity_payload
      
      Commit 08778795 ("block: Fix nr_vecs for inline integrity vectors") from
      Martin introduced the function bip_integrity_vecs() (which returns the
      number of usable vectors) to fix the nr_vecs issue for inline integrity
      vectors that was reported by David Milburn.
      
      But it seems that bip_integrity_vecs() will return the wrong number if the
      bio is not based on any bio_set for some reason (bio->bi_pool == NULL),
      because in that case bip_inline_vecs[0] is allocated directly.  So
      here we add bip_max_vcnt to record the number of vector slots, and
      clean up the function bip_integrity_vecs().
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      blk-mq: use percpu_ref for mq usage count
      
      Currently, blk-mq uses a percpu_counter to keep track of how many
      usages are in flight.  The percpu_counter is drained while freezing to
      ensure that no usage is left in-flight after freezing is complete.
      blk_mq_queue_enter/exit() and blk_mq_[un]freeze_queue() implement this
      per-cpu gating mechanism.
      
      This type of code has relatively high chance of subtle bugs which are
      extremely difficult to trigger and it's way too hairy to be open coded
      in blk-mq.  percpu_ref can serve the same purpose after the recent
      changes.  This patch replaces the open-coded per-cpu usage counting
      and draining mechanism with percpu_ref.
      
      blk_mq_queue_enter() performs tryget_live on the ref and exit()
      performs put.  blk_mq_freeze_queue() kills the ref and waits until the
      reference count reaches zero.  blk_mq_unfreeze_queue() revives the ref
      and wakes up the waiters.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      blk-mq: collapse __blk_mq_drain_queue() into blk_mq_freeze_queue()
      
      Keeping __blk_mq_drain_queue() as a separate function doesn't buy us
      anything and it's gonna be further simplified.  Let's flatten it into
      its caller.
      
      This patch doesn't make any functional change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      blk-mq: decouple blk-mq freezing from generic bypassing
      
      blk_mq freezing is entangled with generic bypassing which bypasses
      blkcg and io scheduler and lets IO requests fall through the block
      layer to the drivers in FIFO order.  This allows forward progress on
      IOs with the advanced features disabled so that those features can be
      configured or altered without worrying about stalling IO which may
      lead to deadlock through memory allocation.
      
      However, generic bypassing doesn't quite fit blk-mq.  blk-mq currently
      doesn't make use of blkcg or ioscheds and it maps bypassing to
      freezing, which blocks request processing and drains all the in-flight
      ones.  This causes problems as bypassing assumes that request
      processing is online.  blk-mq works around this by conditionally
      allowing request processing for the problem case - during queue
      initialization.
      
      Another oddity is that except for during queue cleanup, bypassing
      started on the generic side prevents blk-mq from processing new
      requests but doesn't drain the in-flight ones.  This shouldn't break
      anything but again highlights that something isn't quite right here.
      
      The root cause is conflating blk-mq freezing and generic bypassing
      which are two different mechanisms.  The only intersecting purpose
      that they serve is during queue cleanup.  Let's properly separate
      blk-mq freezing from generic bypassing and simply use it where
      necessary.
      
      * request_queue->mq_freeze_depth is added and
        blk_mq_[un]freeze_queue() now operate on this counter instead of
        ->bypass_depth.  The replacement for QUEUE_FLAG_BYPASS isn't added
        but the counter is tested directly.  This will be further updated by
        later changes.
      
      * blk_mq_drain_queue() is dropped and "__" prefix is dropped from
        blk_mq_freeze_queue().  Queue cleanup path now calls
        blk_mq_freeze_queue() directly.
      
      * blk_mq_queue_enter()'s fast path condition is simplified to simply
        check @q->mq_freeze_depth.  Previously, the condition was
      
      	!blk_queue_dying(q) &&
      	    (!blk_queue_bypass(q) || !blk_queue_init_done(q))
      
        mq_freeze_depth is incremented right after dying is set and
        blk_queue_init_done() exception isn't necessary as blk-mq doesn't
        start frozen, which only leaves the blk_queue_bypass() test which
        can be replaced by @q->mq_freeze_depth test.
      
      This change simplifies the code and reduces confusion in the area.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      block, blk-mq: draining can't be skipped even if bypass_depth was non-zero
      
      Currently, both blk_queue_bypass_start() and blk_mq_freeze_queue()
      skip queue draining if bypass_depth was already above zero.  The
      assumption is that the one which bumped the bypass_depth should have
      performed draining already; however, there's nothing which prevents a
      new instance of bypassing/freezing from starting before the previous
      one finishes draining.  The current code may allow the later
      bypassing/freezing instances to complete while there still are
      in-flight requests which haven't finished draining.
      
      Fix it by draining regardless of bypass_depth.  We still skip draining
      from blk_queue_bypass_start() while the queue is initializing to avoid
      introducing excessive delays during boot.  INIT_DONE setting is moved
      above the initial blk_queue_bypass_end() so that bypassing attempts
      can't slip in between.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      
      blk-mq: fix a memory ordering bug in blk_mq_queue_enter()
      
      blk-mq uses a percpu_counter to keep track of how many usages are in
      flight.  The percpu_counter is drained while freezing to ensure that
      no usage is left in-flight after freezing is complete.
      
      blk_mq_queue_enter/exit() and blk_mq_[un]freeze_queue() implement this
      per-cpu gating mechanism; unfortunately, it contains a subtle bug -
      smp_wmb() in blk_mq_queue_enter() doesn't prevent the cpu from
      fetching @q->bypass_depth before incrementing @q->mq_usage_counter and
      if freezing happens in between, the caller can slip through and freezing
      can complete while there are active users.
      
      Use smp_mb() instead so that bypass_depth and mq_usage_counter
      modifications and tests are properly interlocked.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
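
      The ordering requirement can be modelled in user space with C11
      atomics.  This is a sketch under stated assumptions, not the blk-mq
      code: struct gate and all gate_*() names are hypothetical, a single
      atomic counter stands in for the per-cpu mq_usage_counter, and a
      sequentially consistent fence plays the role of smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical single-counter model of the gate. */
struct gate {
    atomic_long users;   /* stands in for mq_usage_counter */
    atomic_int frozen;   /* stands in for bypass_depth */
};

static void gate_init(struct gate *g)
{
    atomic_init(&g->users, 0);
    atomic_init(&g->frozen, 0);
}

/* Returns true if the caller got in, false if the gate is frozen. */
static bool gate_enter(struct gate *g)
{
    atomic_fetch_add_explicit(&g->users, 1, memory_order_relaxed);
    /* Full barrier: orders the increment before the load of frozen. */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&g->frozen, memory_order_relaxed)) {
        /* Frozen: back out so that the freezer can finish draining. */
        atomic_fetch_sub_explicit(&g->users, 1, memory_order_relaxed);
        return false;
    }
    return true;
}

static void gate_exit(struct gate *g)
{
    atomic_fetch_sub_explicit(&g->users, 1, memory_order_release);
}

static void gate_freeze_start(struct gate *g)
{
    atomic_fetch_add_explicit(&g->frozen, 1, memory_order_relaxed);
    /* Pairs with the fence in gate_enter(). */
    atomic_thread_fence(memory_order_seq_cst);
}

/* True once no user is inside; a real freezer would sleep until then. */
static bool gate_drained(struct gate *g)
{
    return atomic_load_explicit(&g->users, memory_order_acquire) == 0;
}

static void gate_unfreeze(struct gate *g)
{
    atomic_fetch_sub_explicit(&g->frozen, 1, memory_order_relaxed);
}
```

      With full fences on both sides, either the entering thread observes
      the freeze or the freezer observes the incremented count.  A
      store-only barrier in gate_enter() would not order the increment
      against the later load, which is the gap the message describes.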
      
      Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu into for-3.17/core
      
      Merge the percpu_ref changes from Tejun, he says they are stable now.
      
      percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero()
      
      Now that explicit invocation of percpu_ref_exit() is necessary to free
      the percpu counter, we can implement percpu_ref_reinit() which
      reinitializes a released percpu_ref.  This can be used to implement a
      scalable gating switch which can be drained and then re-opened without
      worrying about memory allocation failures.
      
      percpu_ref_is_zero() is added to be used in a sanity check in
      percpu_ref_exit().  As this function will be useful for other purposes
      too, make it a public interface.
      
      v2: Use smp_read_barrier_depends() instead of smp_load_acquire().  We
          only need data dep barrier and smp_load_acquire() is stronger and
          heavier on some archs.  Spotted by Lai Jiangshan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      
      percpu-refcount: require percpu_ref to be exited explicitly
      
      Currently, a percpu_ref undoes percpu_ref_init() automatically by
      freeing the allocated percpu area when the percpu_ref is killed.
      While seemingly convenient, this has the following niggles.
      
      * It's impossible to re-init a released reference counter without
        going through re-allocation.
      
      * In a similar vein, it's impossible to initialize a percpu_ref
        count with static percpu variables.
      
      * We need and have an explicit destructor anyway for failure paths -
        percpu_ref_cancel_init().
      
      This patch removes the automatic percpu counter freeing in
      percpu_ref_kill_rcu() and repurposes percpu_ref_cancel_init() into a
      generic destructor now named percpu_ref_exit().  percpu_ref_destroy()
      is considered but it gets confusing with percpu_ref_kill() while
      "exit" clearly indicates that it's the counterpart of
      percpu_ref_init().
      
      All percpu_ref_cancel_init() users are updated to invoke
      percpu_ref_exit() instead and explicit percpu_ref_exit() calls are
      added to the destruction path of all percpu_ref users.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
      Cc: Li Zefan <lizefan@huawei.com>
      
      percpu-refcount: use unsigned long for pcpu_count pointer
      
      percpu_ref->pcpu_count is a percpu pointer with a status flag in its
      lowest bit.  As such, it always goes through arithmetic operations,
      which are very cumbersome to do on a pointer.  It has to be first
      cast to unsigned long and then back.
      
      Let's just make the field unsigned long so that we can skip the first
      casts.  While at it, rename it to pcpu_count_ptr to clarify that
      it's a pointer value.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
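
      A user-space sketch of keeping a status flag in the lowest bit of a
      pointer value stored as an integer.  All names are hypothetical and
      uintptr_t is used for portability where the kernel uses unsigned
      long; this is not the percpu-refcount code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag kept in bit 0 of the stored pointer value. */
#define REF_DEAD ((uintptr_t)1)

/* Store the counter address as an integer; bit 0 is free due to alignment. */
static uintptr_t ref_pack(long *counter)
{
    return (uintptr_t)counter;
}

static uintptr_t ref_mark_dead(uintptr_t v)
{
    return v | REF_DEAD;
}

static bool ref_is_dead(uintptr_t v)
{
    return (v & REF_DEAD) != 0;
}

/* Mask the flag out to recover the real pointer. */
static long *ref_counter(uintptr_t v)
{
    return (long *)(v & ~REF_DEAD);
}
```

      Keeping the field as an integer means only the final conversion back
      to a pointer needs a cast.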
      
      percpu-refcount: add helpers for ->percpu_count accesses
      
      * All four percpu_ref_*() operations implemented in the header file
        perform the same operation to determine whether the percpu_ref is
        alive and extract the percpu pointer.  Factor out the common logic
        into __pcpu_ref_alive().  This doesn't change the generated code.
      
      * There are a couple of places in percpu-refcount.c which mask out
        PCPU_REF_DEAD to obtain the percpu pointer.  Factor it out into
        pcpu_count_ptr().
      
      * The above changes make the WARN_ON_ONCE() conditional at the top of
        percpu_ref_kill_and_confirm() the only user of REF_STATUS().  Test
        PCPU_REF_DEAD directly and remove REF_STATUS().
      
      This patch doesn't introduce any functional change.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      
      percpu-refcount: one bit is enough for REF_STATUS
      
      percpu-refcount currently reserves the two lowest bits of its percpu
      pointer to indicate its state; however, only one bit is used for
      PCPU_REF_DEAD.
      
      Simplify it by removing PCPU_STATUS_BITS/MASK and testing
      PCPU_REF_DEAD directly.  This also allows the compiler to choose a
      more efficient instruction depending on the architecture.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      
      percpu-refcount, aio: use percpu_ref_cancel_init() in ioctx_alloc()
      
      ioctx_alloc() reaches inside percpu_ref and directly frees
      ->pcpu_count in its failure path, which is quite gross.  percpu_ref
      has been providing a proper interface to do this,
      percpu_ref_cancel_init(), for quite some time now.  Let's use that
      instead.
      
      This patch doesn't introduce any behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Benjamin LaHaise <bcrl@kvack.org>
      Cc: Kent Overstreet <kmo@daterainc.com>
      
      workqueue: stronger test in process_one_work()
      
      After the recent changes, when POOL_DISASSOCIATED is cleared, the
      running worker's local CPU should be the same as pool->cpu without any
      exception even during cpu-hotplug.  Update the sanity check in
      process_one_work() accordingly.
      
      This patch changes "(proposition_A && proposition_B && proposition_C)"
      to "(proposition_B && proposition_C)", so if the old compound
      proposition is true, the new one must be true too.  The change
      therefore will not hide any bug that the old test could catch.
      
      tj: Minor updates to the description.
      
      CC: Jason J. Herne <jjherne@linux.vnet.ibm.com>
      CC: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      
      workqueue: clear POOL_DISASSOCIATED in rebind_workers()
      
      The commit a9ab775b ("workqueue: directly restore CPU affinity of
      workers from CPU_ONLINE") moved the pool->lock into rebind_workers()
      without also moving "pool->flags &= ~POOL_DISASSOCIATED".
      
      There is nothing wrong with "pool->flags &= ~POOL_DISASSOCIATED" not
      being moved together, but there isn't any benefit either. We move it
      into rebind_workers() and achieve these benefits:
      
      1) Better readability.  POOL_DISASSOCIATED is cleared in
         rebind_workers() as expected.
      
      2) When POOL_DISASSOCIATED is cleared, we can ensure that all the
         running workers of the pool are on the local CPU (pool->cpu).
      
      tj: Cosmetic updates to the code and description.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      
      percpu: Use ALIGN macro instead of hand coding alignment calculation
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
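
      For reference, the usual round-up calculation that an alignment macro
      of this kind encapsulates, written as a hypothetical user-space
      function.  It assumes the alignment is a power of two and is not the
      kernel's macro definition.

```c
#include <stddef.h>

/* Round x up to the next multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
    return (x + a - 1) & ~(a - 1);
}
```

      Hand-coded variants of this expression are easy to get subtly wrong,
      which is the motivation for using one shared helper.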
      
      percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations
      
      __verify_pcpu_ptr() is used by percpu accessor and operation
      implementations to verify that a specified parameter is actually a
      percpu pointer.  Currently, where it's called isn't clearly defined
      and we just ensure that it's invoked at least once for all accessors
      and operations.
      
      The lack of clarity on when it should be called isn't nice and given
      that this is a completely generic issue, there's no reason to make
      archs worry about it.
      
      This patch updates __verify_pcpu_ptr() invocations such that it's
      always invoked from the final generic wrapper once per access or
      operation.  As this is already the case for {raw|this}_cpu_*()
      definitions through __pcpu_size_*(), only the {raw|per|this}_cpu_ptr()
      accessors need to be updated.
      
      This change makes it unnecessary for archs to worry about
      __verify_pcpu_ptr().  x86's arch_raw_cpu_ptr() is updated accordingly.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      
      percpu: prettify percpu header files
      
      percpu macros are difficult to read.  It's partly because they're
      fairly complex but also because they simply lack visual and
      conventional consistency to an unusual degree.  The preceding patches
      tried to organize macro definitions consistently by their roles.  This
      patch makes the following cosmetic changes to improve overall
      readability.
      
      * Use consistent convention for multi-line macro definitions - "do {"
        or "({" are now put on their own lines and the line continuing '\'
        are all put on the same column.
      
      * Temp variables used inside macro are consistently given "__" prefix.
      
      * When a macro argument is passed to another macro or a function,
        putting extra parentheses around it doesn't help anything.  Don't put
        them.
      
      * _this_cpu_generic_*() are renamed to this_cpu_generic_*() so that
        they're consistent with raw_cpu_generic_*().
      
      * Reorganize raw_cpu_*() and this_cpu_*() definitions so that trivial
        wrappers are collected in one place after actual operation
        definitions.
      
      * Other misc cleanups including reorganizing comments.
      
      All changes in this patch are cosmetic and cause no functional
      difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      
      percpu: use raw_cpu_*() to define __this_cpu_*()
      
      __this_cpu_*() operations are the same as raw_cpu_*() operations
      except for the added __this_cpu_preempt_check().  Curiously, these
      were defined using __pcpu_size_call_*() instead of being layered on top
      of raw_cpu_*().
      
      Let's layer them so that __this_cpu_*() are defined in terms of
      raw_cpu_*().  It's simpler and less error-prone this way.
      
      This patch doesn't introduce any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      
      percpu: reorder macros in percpu header files
      
      * In include/asm-generic/percpu.h, collect {raw|_this}_cpu_generic*()
        macros into one place.  They were dispersed through
        {raw|this}_cpu_*_N() definitions and the visual inconsistency was
        making following the code unnecessarily difficult.
      
      * In include/linux/percpu-defs.h, move __verify_pcpu_ptr() later in
        the file so that it's right above accessor definitions where it's
        actually used.
      
      This is pure reorganization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      
      percpu: move {raw|this}_cpu_*() definitions to include/linux/percpu-defs.h
      
      We're in the process of moving all percpu accessors and operations to
      include/linux/percpu-defs.h so that they're available to arch headers
      without having to include full include/linux/percpu.h which may cause
      cyclic inclusion dependency.
      
      This patch moves {raw|this}_cpu_*() definitions from
      include/linux/percpu.h to include/linux/percpu-defs.h.  The code is
      moved mostly verbatim; however, raw_cpu_*() are placed above
      this_cpu_*() which is more conventional as the raw operations may be
      used to define other variants.
      
      This is pure reorganization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      
      percpu: move generic {raw|this}_cpu_*_N() definitions to include/asm-generic/percpu.h
      
      {raw|this}_cpu_*_N() operations are expected to be provided by archs
      and the generic definitions are provided as fallbacks.  As such, these
      firmly belong to include/asm-generic/percpu.h.
      
      Move the generic definitions to include/asm-generic/percpu.h.  The
      code is moved mostly verbatim; however, raw_cpu_*_N() are placed above
      this_cpu_*_N() which is more conventional as the raw operations may be
      used to define other variants.
      
      This is pure reorganization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      
      percpu: only allow sized arch overrides for {raw|this}_cpu_*() ops
      
      Currently, percpu allows two separate methods for overriding
      {raw|this}_cpu_*() ops - for a given operation, an arch can provide
      whole replacement or sized sub operations to override specific parts
      of it.  e.g. arch either can provide this_cpu_add() or
      this_cpu_add_4() to override only the 4 byte operation.
      
      While quite flexible at a glance, the dual-overriding scheme
      complicates the code path for no actual gain.  It complicates the
      already complex operation definitions and if an arch wants to override
      all sizes, it can easily provide all variants anyway.  In fact, no
      arch is actually making use of whole operation override.
      
      Another oddity is that __this_cpu_*() operations are defined in the
      same way as raw_cpu_*() but ignore full overrides of raw_cpu_*()
      and don't allow full operation overrides themselves, so if an arch
      provides whole overrides for raw_cpu_*() operations, __this_cpu_*()
      ends up using the generic implementations.
      
      More importantly, it takes away the layering between arch-specific and
      generic parts making it impossible for the generic part to implement
      arch-independent features on top of arch-specific overrides.
      
      This patch removes the support for whole operation overrides.  As no
      arch is using it, this doesn't cause any actual difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
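
      The layering described above can be sketched in user space as
      follows.  This is only an illustration, not the kernel's actual
      macros: every demo_* name is hypothetical, and a plain "+=" stands
      in for a real per-cpu access.  The "arch" overrides just the 4-byte
      operation, the generic part supplies the remaining sizes as
      fallbacks, and the whole operation is defined once on top of them.

```c
#include <stdint.h>

/* Counts how often the arch-specific path ran (demo instrumentation only). */
static int demo_arch_hits;

/* Hypothetical arch header: overrides only the 4-byte operation. */
#define demo_cpu_add_4(var, val) (demo_arch_hits++, (var) += (val))

/* Generic fallbacks: used only for sizes the arch did not define. */
#ifndef demo_cpu_add_1
#define demo_cpu_add_1(var, val) ((var) += (val))
#endif
#ifndef demo_cpu_add_2
#define demo_cpu_add_2(var, val) ((var) += (val))
#endif
#ifndef demo_cpu_add_4
#define demo_cpu_add_4(var, val) ((var) += (val))
#endif
#ifndef demo_cpu_add_8
#define demo_cpu_add_8(var, val) ((var) += (val))
#endif

/* The whole operation exists once, in generic code, and dispatches on size. */
#define demo_cpu_add(var, val)                          \
	do {                                            \
		switch (sizeof(var)) {                  \
		case 1: demo_cpu_add_1(var, val); break; \
		case 2: demo_cpu_add_2(var, val); break; \
		case 4: demo_cpu_add_4(var, val); break; \
		case 8: demo_cpu_add_8(var, val); break; \
		}                                       \
	} while (0)
```

      Because demo_cpu_add() itself is never overridden, generic code
      could wrap it (for example with debug checks) and every arch would
      pick that up regardless of which sized variants it provides.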
      
      percpu: reorganize include/linux/percpu-defs.h
      
      Reorganize for better readability.
      
      * Accessor definitions are collected into one place and SMP and UP now
        define them in the same order.
      
      * Definitions are layered when possible - e.g. per_cpu() is now
        defined in terms of this_cpu_ptr().
      
      * Rather pointless comment dropped.
      
      * per_cpu(), __raw_get_cpu_var() and __get_cpu_var() are defined in a
        way which can be shared between SMP and UP and moved out of
        CONFIG_SMP blocks.
      
      This patch doesn't introduce any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      
      percpu: move accessors from include/linux/percpu.h to percpu-defs.h
      
      include/linux/percpu-defs.h is gonna host all accessors and operations
      so that arch headers can make use of them too without worrying about
      circular dependency through include/linux/percpu.h.
      
      This patch moves the following accessors from include/linux/percpu.h
      to include/linux/percpu-defs.h.
      
      * get/put_cpu_var()
      * get/put_cpu_ptr()
      * per_cpu_ptr()
      
      This is pure reorganization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      
      percpu: include/asm-generic/percpu.h should contain only arch-overridable parts
      
      The roles of the various percpu header files have become unclear.
      There are four header files involved.
      
       include/linux/percpu-defs.h
       include/linux/percpu.h
       include/asm-generic/percpu.h
       arch/*/include/asm/percpu.h
      
      The original intention for include/asm-generic/percpu.h is providing
      generic definitions for arch-overridable parts; however, it now hosts
      various stuff which can't be overridden by archs.
      
      Also, include/linux/percpu-defs.h was initially added to contain
      section and percpu variable definition macros so that arch header
      files can make use of them without worrying about introducing cyclic
      inclusion dependency by including include/linux/percpu.h; however,
      arch headers sometimes need to access percpu variables too and this is
      one of the reasons why some accessors were implemented in
      include/asm-generic/percpu.h.
      
      Let's clear up the situation by making include/asm-generic/percpu.h
      contain only arch-overridable parts and moving accessors and
      operations into include/linux/percpu-defs.h.  Note that this patch only
      moves things from include/asm-generic/percpu.h.
      include/linux/percpu.h will be taken care of by later patches.
      
      This patch moves the following.
      
      * SHIFT_PERCPU_PTR() / VERIFY_PERCPU_PTR()
      * per_cpu()
      * raw_cpu_ptr()
      * this_cpu_ptr()
      * __get_cpu_var()
      * __raw_get_cpu_var()
      * __this_cpu_ptr()
      * PER_CPU_[SHARED_]ALIGNED_SECTION
      * PER_CPU_FIRST_SECTION
      
      This patch is pure reorganization.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      
      percpu: introduce arch_raw_cpu_ptr()
      
      Currently, archs can override raw_cpu_ptr() directly; however, we
      wanna build a layer of indirection in the generic part of percpu so
      that we can implement generic features there without affecting archs.
      
      Introduce arch_raw_cpu_ptr() which is used to define raw_cpu_ptr() by
      generic percpu code.  The two are identical for now.  x86 is currently
      the only arch which overrides raw_cpu_ptr() and is converted to
      define arch_raw_cpu_ptr() instead.
      
      This doesn't introduce any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      
      percpu: disallow archs from overriding SHIFT_PERCPU_PTR()
      
      It has been about half a decade since all archs started using the
      dynamic percpu allocator and thus the same SHIFT_PERCPU_PTR()
      implementation.  There's no benefit in overriding SHIFT_PERCPU_PTR()
      anymore.
      
      Remove #ifndef around it to clarify that this is identical regardless
      of the arch.
      
      This patch doesn't cause any functional difference.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      
      (cherry picked from commit 2a1b4cf2
      0b462c89)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      3f2c76f9
    • NeilBrown's avatar
      md/raid1,raid10: always abort recover on write error. · 49213bcb
      NeilBrown authored
      Currently we don't abort recovery on a write error if the write error
      to the recovering device was triggered by normal IO (as opposed to
      recovery IO).
      
      This means that for one bitmap region, the recovery might write to the
      recovering device for a few sectors, then not bother for subsequent
      sectors (as it never writes to failed devices).  In this case
      the bitmap bit will be cleared, but it really shouldn't.
      
      The result is that if the recovering device fails and is then re-added
      (after fixing whatever hardware problem triggered the failure),
      the second recovery won't redo the region it was in the middle of,
      so some of the device will not be recovered properly.
      
      If we abort the recovery, the region being processed will be cancelled
      (bit not cleared) and the whole region will be retried.
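
      The difference can be modelled with a toy region, as a rough
      sketch only.  All names below are hypothetical and nothing here is
      taken from the md code; the model just tracks how many sectors
      reached the recovering device and whether the bitmap bit for the
      region ends up cleared.

```c
#include <stdbool.h>

struct demo_region {
	bool dirty_bit;  /* bitmap bit: region still needs recovery */
	int sectors;     /* number of sectors in the region */
	int written;     /* sectors actually written to the recovering device */
};

/*
 * Recover one region.  The recovering device fails just before sector
 * fail_after is written (-1 means it never fails).  abort_on_error
 * selects the behaviour described by the patch.
 */
static void demo_recover_region(struct demo_region *r, int fail_after,
				bool abort_on_error)
{
	bool failed = false;
	bool aborted = false;

	r->written = 0;
	for (int s = 0; s < r->sectors; s++) {
		if (s == fail_after)
			failed = true;
		if (failed) {
			if (abort_on_error) {
				aborted = true;
				break;
			}
			continue; /* never write to a failed device */
		}
		r->written++;
	}

	/* Only a recovery that ran to completion may clear the bit. */
	if (!aborted)
		r->dirty_bit = false;
}
```

      With fail_after = 3 on an 8-sector region, the non-aborting variant
      writes 3 sectors and still clears the bit, while the aborting
      variant leaves the bit set so the whole region is retried later.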
      
      As the bug can result in data corruption the patch is suitable for
      -stable.  For kernels prior to 3.11 there is a conflict in raid10.c
      which will require care.
      
      Original-from: jiao hui <jiaohui@bwstor.com.cn>
      Reported-and-tested-by: jiao hui <jiaohui@bwstor.com.cn>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: stable@vger.kernel.org
      
      (cherry picked from commit 2446dba0)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      49213bcb
    • Eric W. Biederman's avatar
      mnt: Correct permission checks in do_remount · 5061831c
      Eric W. Biederman authored
      While investigating the issue wherein "mount --bind -oremount,ro ..."
      would result in later "mount --bind -oremount,rw" succeeding even if
      the mount started off locked I realized that there are several
      additional mount flags that should be locked and are not.
      
      In particular MNT_NOSUID, MNT_NODEV, MNT_NOEXEC, and the atime
      flags in addition to MNT_READONLY should all be locked.  These
      flags are all per superblock, can all be changed with MS_BIND,
      and should not be changeable if set by a more privileged user.
      
      The following additions to the current logic are added in this patch.
      - nosuid may not be clearable by a less privileged user.
      - nodev  may not be clearable by a less privileged user.
      - noexec may not be clearable by a less privileged user.
      - atime flags may not be changeable by a less privileged user.
      
      The logic with atime is that always setting atime on access is a
      global policy and backup software and auditing software could break if
      atime bits are not updated (when they are configured to be updated),
      and serious performance degradation could result (DOS attack) if atime
      updates happen when they have been explicitly disabled.  Therefore an
      unprivileged user should not be able to mess with the atime bits set
      by a more privileged user.
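
      A minimal sketch of the checks listed above, written as a
      user-space predicate.  The flag names come from this message, but
      the bit values, MNT_ATIME_MASK as used here, and
      demo_remount_allowed() are illustrative assumptions rather than the
      kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative bit values; only the flag names come from the message. */
#define MNT_NOSUID        0x01u
#define MNT_NODEV         0x02u
#define MNT_NOEXEC        0x04u
#define MNT_NOATIME       0x08u
#define MNT_NODIRATIME    0x10u
#define MNT_RELATIME      0x20u
#define MNT_READONLY      0x40u
#define MNT_ATIME_MASK    (MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME)

#define MNT_LOCK_ATIME    0x040000u
#define MNT_LOCK_NOEXEC   0x080000u
#define MNT_LOCK_NOSUID   0x100000u
#define MNT_LOCK_NODEV    0x200000u
#define MNT_LOCK_READONLY 0x400000u

/*
 * cur: flags currently on the mount, including any lock bits.
 * req: flags requested by a less privileged remount.
 * Returns false where a real implementation would refuse the remount.
 */
static bool demo_remount_allowed(unsigned int cur, unsigned int req)
{
	if ((cur & MNT_LOCK_READONLY) && !(req & MNT_READONLY))
		return false;
	if ((cur & MNT_LOCK_NOSUID) && !(req & MNT_NOSUID))
		return false;
	if ((cur & MNT_LOCK_NODEV) && !(req & MNT_NODEV))
		return false;
	if ((cur & MNT_LOCK_NOEXEC) && !(req & MNT_NOEXEC))
		return false;
	/* Locked atime settings may not change in either direction. */
	if ((cur & MNT_LOCK_ATIME) &&
	    ((cur & MNT_ATIME_MASK) != (req & MNT_ATIME_MASK)))
		return false;
	return true;
}
```

      Note the asymmetry: the nosuid, nodev, noexec and read-only locks
      only prevent clearing the flag, whereas the atime lock rejects any
      change at all.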
      
      The additional restrictions are implemented with the addition of
      MNT_LOCK_NOSUID, MNT_LOCK_NODEV, MNT_LOCK_NOEXEC, and MNT_LOCK_ATIME
      mnt flags.
      
      Taken together these changes and the fixes for MNT_LOCK_READONLY
      should make it safe for an unprivileged user to create a user
      namespace and to call "mount --bind -o remount,... ..." without
      the danger of mount flags being changed maliciously.
      
      Cc: stable@vger.kernel.org
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      
      (cherry picked from commit 9566d674)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      5061831c
    • Eric W. Biederman's avatar
      mnt: Only change user settable mount flags in remount · 6fca9e95
      Eric W. Biederman authored
      Kenton Varda <kenton@sandstorm.io> discovered that by remounting a
      read-only bind mount read-only in a user namespace the
      MNT_LOCK_READONLY bit would be cleared, allowing an unprivileged user
      to remount a read-only mount read-write.
      
      Correct this by replacing the mask of mount flags to preserve
      with a mask of mount flags that may be changed, and preserve
      all others.   This ensures that any future bugs with this mask and
      remount will fail in an easy to detect way where new mount flags
      simply won't change.
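
      The two masking strategies can be compared in a short sketch.  The
      DEMO_* names and bit values are hypothetical stand-ins, not the
      kernel's definitions; the point is only what happens to a bit that
      neither mask mentions.

```c
/* Illustrative bits; names and values are assumptions for this sketch. */
#define DEMO_MNT_NOSUID        0x01u
#define DEMO_MNT_NODEV         0x02u
#define DEMO_MNT_NOEXEC        0x04u
#define DEMO_MNT_READONLY      0x40u
#define DEMO_MNT_SHARED        0x1000u    /* a flag the old code preserved */
#define DEMO_MNT_LOCK_READONLY 0x400000u  /* a flag the old code forgot */

#define DEMO_PRESERVE_MASK DEMO_MNT_SHARED
#define DEMO_USER_SETTABLE_MASK \
	(DEMO_MNT_NOSUID | DEMO_MNT_NODEV | DEMO_MNT_NOEXEC | DEMO_MNT_READONLY)

/* Old approach: keep only listed flags, so unlisted ones are silently lost. */
static unsigned int demo_remount_old(unsigned int cur, unsigned int req)
{
	return req | (cur & DEMO_PRESERVE_MASK);
}

/* New approach: change only listed flags, so unlisted ones always survive. */
static unsigned int demo_remount_new(unsigned int cur, unsigned int req)
{
	return (req & DEMO_USER_SETTABLE_MASK) |
	       (cur & ~DEMO_USER_SETTABLE_MASK);
}
```

      Remounting a locked read-only mount read-only again drops
      DEMO_MNT_LOCK_READONLY under the old scheme and keeps it under the
      new one.  A flag missing from the settable mask merely fails to
      change, which is the easy-to-detect failure mode the message refers
      to.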
      
      Cc: stable@vger.kernel.org
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      
      (cherry picked from commit a6138db8)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      6fca9e95
    • Ralf Baechle's avatar
      MIPS: Fix accessing to per-cpu data when flushing the cache · b8a9e7ba
      Ralf Baechle authored
      This fixes the following issue
      
      BUG: using smp_processor_id() in preemptible [00000000] code: kjournald/1761
      caller is blast_dcache32+0x30/0x254
      Call Trace:
      [<8047f02c>] dump_stack+0x8/0x34
      [<802e7e40>] debug_smp_processor_id+0xe0/0xf0
      [<80114d94>] blast_dcache32+0x30/0x254
      [<80118484>] r4k_dma_cache_wback_inv+0x200/0x288
      [<80110ff0>] mips_dma_map_sg+0x108/0x180
      [<80355098>] ide_dma_prepare+0xf0/0x1b8
      [<8034eaa4>] do_rw_taskfile+0x1e8/0x33c
      [<8035951c>] ide_do_rw_disk+0x298/0x3e4
      [<8034a3c4>] do_ide_request+0x2e0/0x704
      [<802bb0dc>] __blk_run_queue+0x44/0x64
      [<802be000>] queue_unplugged.isra.36+0x1c/0x54
      [<802beb94>] blk_flush_plug_list+0x18c/0x24c
      [<802bec6c>] blk_finish_plug+0x18/0x48
      [<8026554c>] journal_commit_transaction+0x3b8/0x151c
      [<80269648>] kjournald+0xec/0x238
      [<8014ac00>] kthread+0xb8/0xc0
      [<8010268c>] ret_from_kernel_thread+0x14/0x1c
      
      Caches in most systems are identical - but not always, so we can't avoid
      the use of smp_call_function() by just looking at the boot CPU's data,
      so we have to fiddle with preemption instead.
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/5835
      
      (cherry picked from commit ff522058)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      b8a9e7ba
    • Aaro Koskinen's avatar
      MIPS: OCTEON: make get_system_type() thread-safe · 43ed8029
      Aaro Koskinen authored
      get_system_type() is not thread-safe on OCTEON. It uses static data,
      and a more dangerous issue is that it calls cvmx_fuse_read_byte()
      every time without any synchronization. Currently it's possible to get
      processes stuck looping forever in kernel simply by launching multiple
      readers of /proc/cpuinfo:
      
      	(while true; do cat /proc/cpuinfo > /dev/null; done) &
      	(while true; do cat /proc/cpuinfo > /dev/null; done) &
      	...
      
      Fix by initializing the system type string only once during the early
      boot.
      Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
      Patchwork: http://patchwork.linux-mips.org/patch/7437/
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
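
      The shape of the fix, building the string once and letting readers
      return the cached copy, can be sketched as below.  Every demo_*
      name is hypothetical: demo_read_board_name() merely stands in for
      the unsynchronized hardware read and does not reflect the OCTEON
      code.

```c
#include <stdio.h>

static char demo_system_type[64]; /* written once, read-only afterwards */
static int demo_hw_reads;         /* counts stand-in hardware reads */

/* Hypothetical stand-in for a hardware read that must not run concurrently. */
static const char *demo_read_board_name(void)
{
	demo_hw_reads++;
	return "demo-board";
}

/* Called once during early boot, before any reader can exist. */
static void demo_init_system_type(void)
{
	snprintf(demo_system_type, sizeof(demo_system_type), "%s (%s)",
		 demo_read_board_name(), "demo-soc");
}

/* Readers touch neither the hardware nor a shared scratch buffer. */
static const char *demo_get_system_type(void)
{
	return demo_system_type;
}
```

      Since demo_get_system_type() only reads immutable data after
      initialization, any number of concurrent callers can use it without
      locking.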
      
      MIPS: CPS: Initialize EVA before bringing up VPEs from secondary cores
      
      The CPS code is doing several memory loads when configuring the VPEs
      from secondary cores, so the segmentation control registers must be
      initialized in time otherwise the kernel will crash with strange
      TLB exceptions.
      Reviewed-by: Paul Burton <paul.burton@imgtec.com>
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Patchwork: http://patchwork.linux-mips.org/patch/7424/
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      
      MIPS: Malta: EVA: Rename 'eva_entry' to 'platform_eva_init'
      
      Rename 'eva_entry' to 'platform_eva_init' as required by the new
      'eva_init' macro in the eva.h header. Since this macro is now used
      in a platform dependent way, it must not depend on its caller so move
      the t1 register initialization inside this macro. Also set the .reorder
      assembler option in case the caller may have previously set .noreorder.
      This may allow a few assembler optimizations. Finally include missing
      headers and document the register usage for this macro.
      Reviewed-by: Paul Burton <paul.burton@imgtec.com>
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Patchwork: http://patchwork.linux-mips.org/patch/7423/
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      
      MIPS: EVA: Add new EVA header
      
      Generic code may need to perform certain operations when EVA is
      enabled, for example, configure the segmentation registers during
      boot. In order to avoid using more CONFIG_EVA ifdefs in the arch code,
      such functions will be added in this header instead.
      Initially this header contains a macro which will be used by generic
      code later on during VPEs configuration on secondary cores.
      All it does is to call the platform specific EVA init code in case
      EVA is enabled.
      Reviewed-by: Paul Burton <paul.burton@imgtec.com>
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Patchwork: http://patchwork.linux-mips.org/patch/7422/
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      
      MIPS: scall64-o32: Fix indirect syscall detection
      
      Commit 4c21b8fd (MIPS: seccomp: Handle indirect system calls (o32))
      added indirect syscall detection for O32 processes running on MIPS64
      but it did not work as expected. The reason is that the scall64-o32
      implementation differs compared to scall32-o32. In the former, the v0
      (syscall number) register contains the absolute syscall number
      (4000 + X) whereas in the latter it contains the relative syscall
      number (X). Fix the code to avoid doing an extra addition, and load
      the v0 register directly to the first argument for syscall_trace_enter.
      Moreover, set the .reorder assembler option in order to have better
      control on this part of the assembly code.
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Patchwork: http://patchwork.linux-mips.org/patch/7481/
      Cc: <stable@vger.kernel.org> # v3.15+
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
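
      The arithmetic behind the bug is small enough to spell out.  In
      this sketch the helper names are hypothetical and C functions stand
      in for what is assembly in the real entry code; the base of 4000
      for O32 syscall numbers is the one quoted in the message above.

```c
#define DEMO_O32_BASE 4000L /* O32 syscall numbers are 4000 + X */

/* scall32-o32 style: v0 holds the relative number X, so the base is added. */
static long demo_nr_from_relative(long v0)
{
	return v0 + DEMO_O32_BASE;
}

/* scall64-o32 style: v0 already holds 4000 + X and is passed through. */
static long demo_nr_from_absolute(long v0)
{
	return v0;
}
```

      Applying the relative conversion to a register that already holds
      the absolute number yields 8000 + X, which matches no O32 syscall.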
      
      MIPS: syscall: Fix AUDIT value for O32 processes on MIPS64
      
      On MIPS64, O32 processes set both TIF_32BIT_ADDR and
      TIF_32BIT_REGS so the previous condition treated O32 applications
      as N32 when evaluating seccomp filters. Fix the condition to check
      both TIF_32BIT_{REGS, ADDR} for the N32 AUDIT flag.
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Patchwork: http://patchwork.linux-mips.org/patch/7480/
      Cc: <stable@vger.kernel.org> # v3.15+
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
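
      A sketch of the classification problem, under two assumptions: the
      DEMO_* names and bit values are invented for illustration, and the
      "old" function is a guess at a check on TIF_32BIT_ADDR alone, which
      is how the message describes the previous behaviour.

```c
enum demo_abi { DEMO_ABI_O32, DEMO_ABI_N32, DEMO_ABI_N64 };

/* Illustrative thread-flag bits, not the kernel's values. */
#define DEMO_TIF_32BIT_REGS 0x1u
#define DEMO_TIF_32BIT_ADDR 0x2u

/* Previous behaviour as described: 32-bit addresses alone imply N32. */
static enum demo_abi demo_classify_old(unsigned int tif)
{
	if (tif & DEMO_TIF_32BIT_ADDR)
		return DEMO_ABI_N32;
	return DEMO_ABI_N64;
}

/* Fixed behaviour: O32 sets both flags, so 32-bit registers are checked first. */
static enum demo_abi demo_classify(unsigned int tif)
{
	if (tif & DEMO_TIF_32BIT_REGS)
		return DEMO_ABI_O32;
	if (tif & DEMO_TIF_32BIT_ADDR)
		return DEMO_ABI_N32;
	return DEMO_ABI_N64;
}
```

      A task with both flags set is reported as N32 by the old check and
      as O32 by the new one, so a seccomp filter sees the right
      architecture value.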
      
      MIPS: Loongson: Fix COP2 usage for preemptible kernel
      
      In a preemptible kernel, only the TIF_USEDFPU flag reliably indicates
      whether _init_fpu()/_restore_fp() is needed, because the value of
      CP0_Status.CU1 isn't changed during preemption.
      
      V2: Fix coding style.
      Signed-off-by: Huacai Chen <chenhc@lemote.com>
      Cc: John Crispin <john@phrozen.org>
      Cc: Steven J. Hill <Steven.Hill@imgtec.com>
      Cc: Aurelien Jarno <aurelien@aurel32.net>
      Cc: linux-mips@linux-mips.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Patchwork: https://patchwork.linux-mips.org/patch/7515/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: NL: Fix nlm_xlp_defconfig build error
      
      The nlm_xlp_defconfig build fails with
      
      ./arch/mips/include/asm/mach-netlogic/topology.h:15:0:
      			error: "topology_core_id" redefined [-Werror]
      In file included from include/linux/smp.h:59:0,
      	[ ...]
                       from arch/mips/mm/dma-default.c:12:
      ./arch/mips/include/asm/smp.h:41:0:
      			note: this is the location of the previous definition
      
      and similar errors.
      
      This is caused by commit bda4584c ("MIPS: Support CPU topology files
      in sysfs") which adds the defines to arch/mips/include/asm/smp.h.
      
      Remove the defines from arch/mips/include/asm/mach-netlogic/topology.h
      as no longer necessary.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Andreas Herrmann <andreas.herrmann@caviumnetworks.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/7513/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: Remove race window in page fault handling
      
      Multicore MIPSes without I/D hardware coherency suffered from a race
      condition in the page fault handler. The page table entry was
      published before any pending lazy D-cache flush was committed, hence
      it allowed execution of stale page cache data by other VPEs in the
      system.
      
      To make the cache handling safe we need to perform flushing already in
      the set_pte_at function. MIPSes without coherent I-caches can get a
      small increase in flushes due to the unavailability of the execute
      flag in set_pte_at.
      
      [ralf@linux-mips.org: outlining set_pte_at() saves a good k in a test
      build, so I moved its definition from pgtable.h to cache.c.]
      Signed-off-by: Lars Persson <larper@axis.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/7511/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: Malta: Improve system memory detection for '{e, }memsize' >= 2G
      
      Using kstrtol to parse the "{e,}memsize" variables was wrong because this
      parses signed long numbers. In case of '{e,}memsize' >= 2G, the top bit
      is set, resulting in -ERANGE errors and possibly random system memory
      boundaries. We fix this by replacing "kstrtol" with "kstrtoul".
      We also improve the code to check the kstrtoul return value and
      print a warning if an error was returned.
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Cc: <stable@vger.kernel.org> # v3.15+
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/7543/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
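
      The range problem can be reproduced in user space with a model of
      32-bit parsing.  This is a sketch only: demo_parse_s32() and
      demo_parse_u32() are hypothetical helpers built on strtoll() and
      strtoull(), standing in for signed and unsigned parsing on a target
      where long is 32 bits wide.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Signed parse limited to 32 bits: values with the top bit set do not fit. */
static int demo_parse_s32(const char *s, int32_t *out)
{
	char *end;
	long long v;

	errno = 0;
	v = strtoll(s, &end, 0);
	if (errno || end == s || *end != '\0')
		return -EINVAL;
	if (v < INT32_MIN || v > INT32_MAX)
		return -ERANGE;
	*out = (int32_t)v;
	return 0;
}

/* Unsigned parse limited to 32 bits: the full 0..4G-1 range is accepted. */
static int demo_parse_u32(const char *s, uint32_t *out)
{
	char *end;
	unsigned long long v;

	errno = 0;
	v = strtoull(s, &end, 0);
	if (errno || end == s || *end != '\0')
		return -EINVAL;
	if (v > UINT32_MAX)
		return -ERANGE;
	*out = (uint32_t)v;
	return 0;
}
```

      For the string "0x80000000" (2G) the signed variant returns
      -ERANGE and leaves the output untouched, which is how an unchecked
      caller can end up with arbitrary memory boundaries; the unsigned
      variant succeeds.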
      
      MIPS: Alchemy: Fix db1200 PSC clock enablement
      
      Enable PSC0 (I2C/SPI) clock and leave PSC1 (Audio) alone.  This patch
      restores functionality to both Audio and I2C/SPI.
      Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
      Cc: Linux-MIPS <linux-mips@linux-mips.org>
      Patchwork: https://patchwork.linux-mips.org/patch/7544/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: BCM47XX: Fix reboot problem on BCM4705/BCM4785
      
      This adds some code based on code from the Broadcom GPL tar to fix the
      reboot problems on BCM4705/BCM4785. I tried rebooting my device about 10
      times and have never seen a problem. This reverts the changes in the
      previous commit and adds the real fix as suggested by Rafał.
      
      Setting bit 22 in Reg 22, sel 4 puts the BIU (Bus Interface Unit) into
      async mode.
      
      The previous commit was 316cad5c [MIPS:
      BCM47XX: make reboot more relaiable]
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Cc: jogo@openwrt.org
      Cc: zajec5@gmail.com
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/7545/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: Remove duplicated include from numa.c
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/7537/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: Add common plat_irq_dispatch declaration
      
      Add common declaration to get rid of following sparse warning: "symbol
      'plat_irq_dispatch' was not declared. Should it be static?"
      Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
      Cc: Linux MIPS <linux-mips@linux-mips.org>
      Patchwork: https://patchwork.linux-mips.org/patch/7539/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: MSP71xx: remove unused plat_irq_dispatch() argument
      
      Remove unused argument to make the plat_irq_dispatch() function
      declaration similar to the realization of other platforms.
      Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
      Cc: Linux MIPS <linux-mips@linux-mips.org>
      Patchwork: https://patchwork.linux-mips.org/patch/7538/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: GIC: Remove useless parens from GICBIS().
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: perf: Mark pmu interupt IRQF_NO_THREAD
      
      In an RT kernel, I ran into the following call trace, so PMU interrupts
      cannot be threaded:
      
      in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/0
      INFO: lockdep is turned off.
      Call Trace:
      [<ffffffff8088595c>] dump_stack+0x1c/0x50
      [<ffffffff801a958c>] __might_sleep+0x13c/0x148
      [<ffffffff80891c54>] rt_spin_lock+0x3c/0xb0
      [<ffffffff801ad29c>] __wake_up+0x3c/0x80
      [<ffffffff80243ba4>] perf_event_wakeup+0x8c/0xf8
      [<ffffffff80243c50>] perf_pending_event+0x40/0x78
      [<ffffffff8023d88c>] irq_work_run+0x74/0xc0
      [<ffffffff80152640>] mipsxx_pmu_handle_shared_irq+0x110/0x228
      [<ffffffff8015276c>] mipsxx_pmu_handle_irq+0x14/0x30
      [<ffffffff801ffda4>] handle_irq_event_percpu+0xbc/0x470
      [<ffffffff80204478>] handle_percpu_irq+0x98/0xc8
      [<ffffffff801ff284>] generic_handle_irq+0x4c/0x68
      [<ffffffff8089748c>] do_IRQ+0x2c/0x48
      [<ffffffff80105864>] plat_irq_dispatch+0x64/0xd0
      
      [ralf@linux-mips.org: I don't see why based on this register dump the
      handler should be marked IRQF_NO_THREAD - but the handler is manipulating
      per-CPU resources so we don't want it to be rescheduled to another CPU.]
      Signed-off-by: Yang Wei <Wei.Yang@windriver.com>
      Cc: a.p.zijlstra@chello.nl
      Cc: paulus@samba.org
      Cc: mingo@redhat.com
      Cc: acme@kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/7506/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: kdump: Set correct value to kexec_indirection_page variable
      
      Since there is no indirection page for the crash type, the value of the
      head field of the kimage structure is not the address of an indirection
      page but IND_DONE, so we have to set the kexec_indirection_page variable
      to the address of the head field of the image structure.
      
      [ralf@linux-mips.org: Don't add pointless empty line, fix trailing
      whitespace damage.]
      Signed-off-by: Yang Wei <Wei.Yang@windriver.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/7499/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      
      MIPS: OCTEON: make get_system_type() thread-safe
      
      get_system_type() is not thread-safe on OCTEON. It uses static data,
      and a more dangerous issue is that it calls cvmx_fuse_read_byte()
      every time without any synchronization. Currently it's possible to get
      processes stuck looping forever in kernel simply by launching multiple
      readers of /proc/cpuinfo:
      
      	(while true; do cat /proc/cpuinfo > /dev/null; done) &
      	(while true; do cat /proc/cpuinfo > /dev/null; done) &
      	...
      
      Fix by initializing the system type string only once during the early
      boot.
      Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Markos Chandras <markos.chandras@imgtec.com>
      Patchwork: http://patchwork.linux-mips.org/patch/7437/
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      
      (cherry picked from commit 33d9a530
      60830868)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      43ed8029
    • Jarkko Sakkinen's avatar
      tpm: missing tpm_chip_put in tpm_get_random() · 2273ecb5
      Jarkko Sakkinen authored
      Regression in 41ab999c. The call to tpm_chip_put() is missing. This
      will prevent the TPM device driver from unloading if tpm_get_random()
      is called.
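
      The leak pattern is the usual unbalanced get/put.  The sketch below
      uses hypothetical demo_* names and a plain counter in place of the
      real reference; it shows only that every lookup must be paired
      with a put before returning.

```c
#include <stddef.h>

struct demo_chip {
	int refcount; /* a non-zero count blocks unloading in this model */
};

static struct demo_chip demo_the_chip;

/* Look up the chip and take a reference on it. */
static struct demo_chip *demo_chip_find_get(void)
{
	demo_the_chip.refcount++;
	return &demo_the_chip;
}

/* Drop a reference taken by demo_chip_find_get(). */
static void demo_chip_put(struct demo_chip *chip)
{
	chip->refcount--;
}

/* Fill out[] with max placeholder bytes; returns the number produced. */
static int demo_get_random(unsigned char *out, size_t max)
{
	struct demo_chip *chip = demo_chip_find_get();
	int total = 0;

	for (size_t i = 0; i < max; i++) {
		out[i] = (unsigned char)(i * 37u + 11u); /* not random */
		total++;
	}

	demo_chip_put(chip); /* the kind of call the regression dropped */
	return total;
}
```

      Without the final demo_chip_put(), each call would leave the count
      one higher, and in this model the driver could never be unloaded
      again.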
      
      Cc: <stable@vger.kernel.org> # 3.7+
      Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      
      (cherry picked from commit 3e14d83e)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      2273ecb5
    • Paolo Bonzini's avatar
      Revert "KVM: x86: Increase the number of fixed MTRR regs to 10" · 19249df2
      Paolo Bonzini authored
      This reverts commit 682367c4,
      which causes 32-bit SMP Windows 7 guests to panic.
      
      SeaBIOS has a limit on the number of MTRRs that it can handle,
      and this patch exceeded the limit.  Better revert it.
      Thanks to Nadav Amit for debugging the cause.
      
      Cc: stable@nongnu.org
      Reported-by: Wanpeng Li <wanpeng.li@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      
      (cherry picked from commit 0d234daf)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      19249df2