1. 25 Jan, 2014 8 commits
    • arch/x86/mm/srat: Skip NUMA_NO_NODE while parsing SLIT · a85eba88
      Toshi Kani authored
      When ACPI SLIT table has an I/O locality (i.e. a locality
      unique to an I/O device), numa_set_distance() emits this warning
      message:
      
       NUMA: Warning: node ids are out of bound, from=-1 to=-1 distance=10
      
      acpi_numa_slit_init() calls numa_set_distance() with
      pxm_to_node(), which assumes that all localities have been
      parsed with SRAT previously.  SRAT does not list I/O localities,
      whereas SLIT lists all localities including I/Os.  Hence,
      pxm_to_node() returns NUMA_NO_NODE (-1) for an I/O locality.
      
      I/O localities are not supported and are ignored today, but emitting
      such a warning message leads to unnecessary confusion.
      
      Change acpi_numa_slit_init() to avoid calling
      numa_set_distance() with NUMA_NO_NODE.
      Signed-off-by: Toshi Kani <toshi.kani@hp.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/n/tip-dSvpjjvp8aMzs1ybkftxohlh@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
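      The skip described in this commit can be sketched in user space as
      follows.  This is only an illustration: pxm_to_node_sketch(), the
      locality map and the printf() stand in for pxm_to_node() and
      numa_set_distance() and are assumptions, not the kernel code.

```c
#include <assert.h>
#include <stdio.h>

#define NUMA_NO_NODE (-1)
#define MAX_LOCALITIES 4

/* Hypothetical stand-in for pxm_to_node(): localities that SRAT did not
 * list (for example an I/O locality) map to NUMA_NO_NODE. */
static const int pxm_node_map[MAX_LOCALITIES] = { 0, 1, NUMA_NO_NODE, 2 };

static int pxm_to_node_sketch(int pxm)
{
    if (pxm < 0 || pxm >= MAX_LOCALITIES)
        return NUMA_NO_NODE;
    return pxm_node_map[pxm];
}

/* Walk an n x n distance matrix and return how many entries would be
 * recorded, skipping every pair that involves an unmapped locality. */
static int slit_parse_sketch(const unsigned char *dist, int n)
{
    int recorded = 0;

    assert(dist != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        int from = pxm_to_node_sketch(i);

        if (from == NUMA_NO_NODE)
            continue;
        for (int j = 0; j < n; j++) {
            int to = pxm_to_node_sketch(j);

            if (to == NUMA_NO_NODE)
                continue;
            /* The kernel would record the distance here. */
            printf("node %d -> node %d: distance %d\n",
                   from, to, (int)dist[i * n + j]);
            recorded++;
        }
    }
    return recorded;
}
```

      With four localities of which one is I/O-only, nine of the sixteen
      matrix entries are recorded and no out-of-bound pair is ever passed on.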
    • mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge · b9a3b4c9
      Mel Gorman authored
      There was a large ebizzy performance regression that was
      bisected to commit 611ae8e3 (x86/tlb: enable tlb flush range
      support for x86).  The problem was related to the
      tlb_flushall_shift tuning for IvyBridge which was altered.  The
      problem is that it is not clear if the tuning values for each
      CPU family are correct as the methodology used to tune the values
      is unclear.
      
      This patch uses a conservative tlb_flushall_shift value for all
      CPU families except IvyBridge so the decision can be revisited
      if any regression is found as a result of this change.
      IvyBridge is an exception as testing with one methodology
      determined that the value of 2 is acceptable.  Details are in
      the changelog for the patch "x86: mm: Change tlb_flushall_shift
      for IvyBridge".
      
      One important aspect of this to watch out for is Xen.  The
      original commit log mentioned large performance gains on Xen.
      It's possible Xen is more sensitive to this value if it flushes
      small ranges of pages more frequently than workloads on bare
      metal typically do.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Tested-by: Davidlohr Bueso <davidlohr@hp.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-dyzMww3fqugnhbhgo6Gxmtkw@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86: mm: change tlb_flushall_shift for IvyBridge · f98b7a77
      Mel Gorman authored
      There was a large performance regression that was bisected to
      commit 611ae8e3 ("x86/tlb: enable tlb flush range support for
      x86").  This patch simply changes the default balance point
      between a local and global flush for IvyBridge.
      
      In the interest of allowing the tests to be reproduced, this
      patch was tested using mmtests 0.15 with the following
      configurations
      
      	configs/config-global-dhp__tlbflush-performance
      	configs/config-global-dhp__scheduler-performance
      	configs/config-global-dhp__network-performance
      
      Results are from two machines
      
      Ivybridge   4 threads:  Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz
      Ivybridge   8 threads:  Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
      
      Page fault microbenchmark showed nothing interesting.
      
      Ebizzy was configured to run multiple iterations and threads.
      Thread counts ranged from 1 to NR_CPUS*2. For each thread count,
      it ran 100 iterations and each iteration lasted 10 seconds.
      
      Ivybridge 4 threads
                          3.13.0-rc7            3.13.0-rc7
                             vanilla           altshift-v3
      Mean   1     6395.44 (  0.00%)     6789.09 (  6.16%)
      Mean   2     7012.85 (  0.00%)     8052.16 ( 14.82%)
      Mean   3     6403.04 (  0.00%)     6973.74 (  8.91%)
      Mean   4     6135.32 (  0.00%)     6582.33 (  7.29%)
      Mean   5     6095.69 (  0.00%)     6526.68 (  7.07%)
      Mean   6     6114.33 (  0.00%)     6416.64 (  4.94%)
      Mean   7     6085.10 (  0.00%)     6448.51 (  5.97%)
      Mean   8     6120.62 (  0.00%)     6462.97 (  5.59%)
      
      Ivybridge 8 threads
                           3.13.0-rc7            3.13.0-rc7
                              vanilla           altshift-v3
      Mean   1      7336.65 (  0.00%)     7787.02 (  6.14%)
      Mean   2      8218.41 (  0.00%)     9484.13 ( 15.40%)
      Mean   3      7973.62 (  0.00%)     8922.01 ( 11.89%)
      Mean   4      7798.33 (  0.00%)     8567.03 (  9.86%)
      Mean   5      7158.72 (  0.00%)     8214.23 ( 14.74%)
      Mean   6      6852.27 (  0.00%)     7952.45 ( 16.06%)
      Mean   7      6774.65 (  0.00%)     7536.35 ( 11.24%)
      Mean   8      6510.50 (  0.00%)     6894.05 (  5.89%)
      Mean   12     6182.90 (  0.00%)     6661.29 (  7.74%)
      Mean   16     6100.09 (  0.00%)     6608.69 (  8.34%)
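
      The percentages in the ebizzy tables appear to be the relative
      change against the vanilla column.  The commit message does not
      spell out the formula, so the helper below is an assumption that
      happens to reproduce the figures shown; percent_gain() is a
      hypothetical name.

```c
#include <assert.h>

/* Relative change of `patched` against `baseline`, in percent.
 * A positive result means the patched kernel scored higher. */
static double percent_gain(double baseline, double patched)
{
    assert(baseline > 0.0);
    return (patched - baseline) / baseline * 100.0;
}
```

      For the first row of the 4 thread table, percent_gain(6395.44,
      6789.09) comes out at roughly 6.16, matching the value listed.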
      
      Ebizzy hits the worst case scenario for TLB range flushing every
      time and it shows for these Ivybridge CPUs at least that the
      default choice is a poor one.  The patch addresses the problem.
      
      Next was a tlbflush microbenchmark written by Alex Shi at
      http://marc.info/?l=linux-kernel&m=133727348217113 .  It
      measures access costs while the TLB is being flushed.  The
      expectation is that the benchmark would suffer if there are
      always full TLB flushes and that it benefits from range flushing.
      
      There are 320 iterations of the test per thread count.  The
      number of entries is randomly selected with a min of 1 and max
      of 512.  To ensure a reasonably even spread of entries, the full
      range is broken up into 8 sections and a random number selected
      within that section.
      
      iteration 1, random number between 0-64
      iteration 2, random number between 64-128 etc
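
      The selection scheme above can be sketched as follows.  How the
      benchmark handles section boundaries and how iterations map to
      sections is not stated, so the half-open sections and the
      modulo-based cycling here are assumptions, and
      entries_for_iteration() is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_ENTRIES  512
#define NR_SECTIONS  8
#define SECTION_SIZE (MAX_ENTRIES / NR_SECTIONS)   /* 64 */

/* Pick the number of entries for a 0-based iteration.  Iterations
 * cycle through the 8 sections and the value is drawn at random
 * within the section, so the result always lies in [1, 512]. */
static int entries_for_iteration(int iteration)
{
    assert(iteration >= 0);

    int section = iteration % NR_SECTIONS;
    int offset = rand() % SECTION_SIZE;            /* 0..63 */

    return section * SECTION_SIZE + offset + 1;
}
```

      Over 320 iterations each section is visited 40 times, which gives
      the even spread the text describes.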
      
      This is still a very weak methodology.  When you do not know
      what the typical ranges are, random is a reasonable choice but it
      can be easily argued that the optimisation was for smaller ranges
      and an even spread is not representative of any workload that
      matters.  To improve this, we'd need to know the probability
      distribution of TLB flush range sizes for a set of workloads
      that are considered "common", build a synthetic trace and feed
      that into this benchmark.  Even that is not perfect because it
      would not account for the time between flushes but there are
      limits of what can be reasonably done and still be doing
      something useful.  If a representative synthetic trace is
      provided then this benchmark could be revisited and the shift
      values retuned.
      
      Ivybridge 4 threads
                              3.13.0-rc7            3.13.0-rc7
                                 vanilla           altshift-v3
      Mean       1       10.50 (  0.00%)       10.50 (  0.03%)
      Mean       2       17.59 (  0.00%)       17.18 (  2.34%)
      Mean       3       22.98 (  0.00%)       21.74 (  5.41%)
      Mean       5       47.13 (  0.00%)       46.23 (  1.92%)
      Mean       8       43.30 (  0.00%)       42.56 (  1.72%)
      
      Ivybridge 8 threads
                               3.13.0-rc7            3.13.0-rc7
                                  vanilla           altshift-v3
      Mean       1         9.45 (  0.00%)        9.36 (  0.93%)
      Mean       2         9.37 (  0.00%)        9.70 ( -3.54%)
      Mean       3         9.36 (  0.00%)        9.29 (  0.70%)
      Mean       5        14.49 (  0.00%)       15.04 ( -3.75%)
      Mean       8        41.08 (  0.00%)       38.73 (  5.71%)
      Mean       13       32.04 (  0.00%)       31.24 (  2.49%)
      Mean       16       40.05 (  0.00%)       39.04 (  2.51%)
      
      For both CPUs, average access time is reduced which is good as
      this is the benchmark that was used to tune the shift values in
      the first place albeit it is not known *how* the benchmark was
      used.
      
      The scheduler benchmarks were somewhat inconclusive.  They
      showed gains and losses and make me reconsider how stable those
      benchmarks really are or if something else might be interfering
      with the test results recently.
      
      Network benchmarks were inconclusive.  Almost all results were
      flat except for netperf-udp tests on the 4 thread machine.
      These results were unstable and showed large variations between
      reboots.  It is unknown if this is a recent problem but I've
      noticed before that netperf-udp results tend to vary.
      
      Based on these results, changing the default for Ivybridge seems
      like a logical choice.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Tested-by: Davidlohr Bueso <davidlohr@hp.com>
      Reviewed-by: Alex Shi <alex.shi@linaro.org>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-cqnadffh1tiqrshthRj3Esge@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Eliminate redundant page table walk during TLB range flushing · 71b54f82
      Mel Gorman authored
      When choosing between doing an address space or ranged flush,
      the x86 implementation of flush_tlb_mm_range takes into account
      whether there are any large pages in the range.  A per-page
      flush typically requires fewer entries than would be covered by a
      single large page and the check is redundant.
      
      There is one potential exception.  THP migration flushes single
      THP entries and it conceivably would benefit from flushing a
      single entry instead of the mm.  However, this flush is after a
      THP allocation, copy and page table update potentially with any
      other threads serialised behind it.  In comparison to that, the
      flush is noise.  It makes more sense to optimise balancing to
      require fewer flushes than to optimise the flush itself.
      
      This patch deletes the redundant huge page check.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Tested-by: Davidlohr Bueso <davidlohr@hp.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-sgei1drpOcburujPsfh6ovmo@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Clean up inconsistencies when flushing TLB ranges · 15aa3682
      Mel Gorman authored
      NR_TLB_LOCAL_FLUSH_ALL is not always accounted for correctly and
      the comparison with total_vm is done before taking
      tlb_flushall_shift into account.  Clean it up.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Tested-by: Davidlohr Bueso <davidlohr@hp.com>
      Reviewed-by: Alex Shi <alex.shi@linaro.org>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Hugh Dickins <hughd@google.com>
      Link: http://lkml.kernel.org/n/tip-Iz5gcahrgskIldvukulzi0hh@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
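      The ordering issue this commit mentions, applying
      tlb_flushall_shift before comparing with total_vm, can be
      illustrated with a small user-space sketch.  should_flush_all()
      and its parameters are hypothetical, and treating a negative
      shift as "always flush everything" is an assumption; this is not
      the kernel's flush_tlb_mm_range().

```c
#include <assert.h>
#include <stdbool.h>

/* Decide between a full flush and per-page flushes.
 *   tlb_entries: assumed TLB size for the CPU
 *   shift:       balance point; negative means always flush all
 *                (an assumption made for this sketch)
 *   total_vm:    pages mapped by the address space
 *   nr_pages:    pages in the range being flushed */
static bool should_flush_all(unsigned long tlb_entries, int shift,
                             unsigned long total_vm,
                             unsigned long nr_pages)
{
    unsigned long act_entries;

    assert(shift < (int)(sizeof(unsigned long) * 8));
    if (shift < 0)
        return true;

    /* Apply the shift first, then cap by the size of the mm. */
    act_entries = tlb_entries >> shift;
    if (act_entries > total_vm)
        act_entries = total_vm;

    return nr_pages > act_entries;
}
```

      With 512 assumed TLB entries and a shift of 2, ranges of up to
      128 pages are flushed page by page and anything larger falls back
      to a full flush.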
    • mm, x86: Account for TLB flushes only when debugging · ec659934
      Mel Gorman authored
      Bisection between 3.11 and 3.12 fingered commit 9824cf97 ("mm:
      vmstats: tlb flush counters") to cause overhead problems.
      
      The counters are undeniably useful but how often do we really
      need to debug TLB flush related issues?  It does not justify
      taking the penalty everywhere so make it a debugging option.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Tested-by: Davidlohr Bueso <davidlohr@hp.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-XzxjntugxuwpxXhcrxqqh53b@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/AMD/NB: Fix amd_set_subcaches() parameter type · 2993ae33
      Dan Carpenter authored
      This is under CAP_SYS_ADMIN, but Smatch complains that mask comes
      from the user and the test for "mask > 0xf" can underflow.
      
      The fix is simple: amd_set_subcaches() should hand down not an 'int'
      but an 'unsigned long' like it was originally intended to do.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Daniel J Blueman <daniel@numascale-asia.com>
      Link: http://lkml.kernel.org/r/20140121072209.GA22095@elgon.mountain
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
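      The class of problem Smatch flags here can be shown with two
      small helpers.  They are hypothetical and only mirror the
      "mask > 0xf" test quoted above; they are not the body of
      amd_set_subcaches().

```c
#include <stdbool.h>

/* Range check with a signed parameter: a negative value compares
 * as smaller than 0xf and slips past the upper-bound test. */
static bool mask_ok_signed(int mask)
{
    return !(mask > 0xf);
}

/* Same check with an unsigned parameter: the value that was
 * negative as an int is now a very large number and is rejected. */
static bool mask_ok_unsigned(unsigned long mask)
{
    return !(mask > 0xf);
}
```

      Passing -1 is accepted by the signed variant but rejected by the
      unsigned one, which is why changing the parameter type is enough.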
    • x86/quirks: Add workaround for AMD F16h Erratum792 · fb53a1ab
      Aravind Gopalakrishnan authored
      The workaround for this Erratum is included in AGESA. But BIOSes
      spun only after Jan 2014 will have the fix (at least server
      versions of the chip). The erratum affects both embedded and
      server platforms and since we cannot say with certainty that
      ALL BIOSes on systems out in the field will have the fix, we
      should probably insulate ourselves in case BIOS does not do the
      right thing or someone is using old BIOSes.
      
      Refer to Revision Guide for AMD F16h models 00h-0fh, document 51810
      Rev. 3.04, November 2013 for details on the Erratum.
      
      Tested the patch on Fam16h server platform and it works fine.
      Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
      Cc: <hmh@hmh.eng.br>
      Cc: <Kim.Naru@amd.com>
      Cc: <Suravee.Suthikulpanit@amd.com>
      Cc: <bp@suse.de>
      Cc: <sherry.hurwitz@amd.com>
      Link: http://lkml.kernel.org/r/1390515212-1824-1-git-send-email-Aravind.Gopalakrishnan@amd.com
      [ Minor edits. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 20 Jan, 2014 1 commit
  3. 16 Jan, 2014 1 commit
    • x86: Add check for number of available vectors before CPU down · da6139e4
      Prarit Bhargava authored
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791
      
      When a cpu is downed on a system, the irqs on the cpu are assigned to
      other cpus.  It is possible, however, that when a cpu is downed there
      aren't enough free vectors on the remaining cpus to account for the
      vectors from the cpu that is being downed.
      
      This results in an interesting "overflow" condition where irqs are
      "assigned" to a CPU but are not handled.
      
      For example, when downing cpus on a 1-64 logical processor system:
      
      <snip>
      [  232.021745] smpboot: CPU 61 is now offline
      [  238.480275] smpboot: CPU 62 is now offline
      [  245.991080] ------------[ cut here ]------------
      [  245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250()
      [  246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
      [  246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas
      [  246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
      [  246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
      [  246.057371]  0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48
      [  246.065728]  ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040
      [  246.074073]  0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000
      [  246.082430] Call Trace:
      [  246.085174]  <IRQ>  [<ffffffff8164fbf6>] dump_stack+0x46/0x58
      [  246.091633]  [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0
      [  246.098352]  [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50
      [  246.104786]  [<ffffffff815710d6>] dev_watchdog+0x246/0x250
      [  246.110923]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
      [  246.119097]  [<ffffffff8106092a>] call_timer_fn+0x3a/0x110
      [  246.125224]  [<ffffffff8106280f>] ? update_process_times+0x6f/0x80
      [  246.132137]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
      [  246.140308]  [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0
      [  246.146933]  [<ffffffff81059a80>] __do_softirq+0xe0/0x220
      [  246.152976]  [<ffffffff8165fedc>] call_softirq+0x1c/0x30
      [  246.158920]  [<ffffffff810045f5>] do_softirq+0x55/0x90
      [  246.164670]  [<ffffffff81059d35>] irq_exit+0xa5/0xb0
      [  246.170227]  [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60
      [  246.177324]  [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70
      [  246.184041]  <EOI>  [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0
      [  246.191559]  [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0
      [  246.198374]  [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200
      [  246.204900]  [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30
      [  246.210846]  [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250
      [  246.217371]  [<ffffffff81646b47>] rest_init+0x77/0x80
      [  246.223028]  [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb
      [  246.229165]  [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e
      [  246.235787]  [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c
      [  246.242990]  [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc
      [  246.249610] ---[ end trace fb74fdef54d79039 ]---
      [  246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout
      [  246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
      Last login: Mon Nov 11 08:35:14 from 10.18.17.119
      [root@(none) ~]# [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
      [  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
      [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
      [  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
      
      (last lines keep repeating.  ixgbe driver is dead until module reload.)
      
      If the downed cpu has more vectors than are free on the remaining cpus on the
      system, it is possible that some vectors are "orphaned" even though they are
      assigned to a cpu.  In this case, since the ixgbe driver had a watchdog, the
      watchdog fired and notified that something was wrong.
      
      This patch adds a function, check_vectors(), to compare the number of
      vectors on the CPU going down with the number of vectors available on
      the system.  If there aren't enough vectors for the CPU to go down, an
      error is returned and propagated back to userspace.
      
      v2: Do not need to look at percpu irqs
      v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest
          Priority Mode
      v4: Additional changes suggested by Gong Chen.
      v5/v6/v7/v8: Updated comment text
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.com
      Reviewed-by: Gong Chen <gong.chen@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Seiji Aguchi <seiji.aguchi@hds.com>
      Cc: Yang Zhang <yang.z.zhang@Intel.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Janet Morgan <janet.morgan@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Ruiv Wang <ruiv.wang@gmail.com>
      Cc: Gong Chen <gong.chen@linux.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: <stable@vger.kernel.org>
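      The comparison performed before a CPU goes down can be sketched
      as below.  struct cpu_vectors, check_vectors_sketch() and the
      -ENOSPC return value are assumptions for illustration; the commit
      message only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-CPU bookkeeping: whether the CPU is online, how
 * many vectors it currently uses and how many are still free. */
struct cpu_vectors {
    int online;
    unsigned int used;
    unsigned int free;
};

/* Return 0 if the vectors in use on `cpu` fit into the free vectors
 * of the remaining online CPUs, -ENOSPC otherwise. */
static int check_vectors_sketch(const struct cpu_vectors *cpus,
                                int nr_cpus, int cpu)
{
    unsigned int available = 0;

    assert(cpu >= 0 && cpu < nr_cpus);
    for (int i = 0; i < nr_cpus; i++) {
        if (i == cpu || !cpus[i].online)
            continue;
        available += cpus[i].free;
    }
    return cpus[cpu].used > available ? -ENOSPC : 0;
}
```

      A caller would refuse to offline the CPU when the result is
      non-zero instead of leaving vectors orphaned.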
  4. 15 Jan, 2014 1 commit
  5. 13 Jan, 2014 2 commits
  6. 12 Jan, 2014 6 commits
    • powerpc: Check return value of instance-to-package OF call · 10348f59
      Benjamin Herrenschmidt authored
      On PA-Semi firmware, the instance-to-package callback doesn't seem
      to be implemented. We didn't check for error, however, and thus
      subsequently passed the -1 value returned into stdout_node to
      things like prom_getprop etc...
      
      This caused the firmware to load values around 0 (physical) internally
      as node structures. It somewhat "worked": as long as we had a NULL in
      the right place (address 8) at the beginning of the kernel, we didn't
      "see" the bug. But commit 5c0484e2 "powerpc: Endian safe trampoline"
      changed the kernel entry point, causing that old bug to now cause a
      crash early during boot.
      
      This fixes booting on PA-Semi board by properly checking the return
      value from instance-to-package.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Tested-by: Olof Johansson <olof@lixom.net>
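      The missing check can be illustrated with a small stand-alone
      sketch.  The function-pointer type, PROM_ERROR_SKETCH and
      find_stdout_node() are hypothetical names; the real code talks to
      Open Firmware rather than to a C callback.

```c
#include <stdint.h>

/* Assumed error value: the firmware call returns -1 on failure. */
#define PROM_ERROR_SKETCH ((uint32_t)-1)

/* Hypothetical firmware call: returns a package handle, or
 * PROM_ERROR_SKETCH when the method is not implemented. */
typedef uint32_t (*instance_to_package_fn)(uint32_t instance);

/* Look up the stdout node.  Returns 0 on success; returns -1 and
 * leaves *node untouched when the firmware call failed, so the bad
 * handle is never passed on to later property lookups. */
static int find_stdout_node(instance_to_package_fn call,
                            uint32_t instance, uint32_t *node)
{
    uint32_t handle = call(instance);

    if (handle == PROM_ERROR_SKETCH)
        return -1;
    *node = handle;
    return 0;
}

/* Firmware that does not implement the method. */
static uint32_t fw_unimplemented(uint32_t instance)
{
    (void)instance;
    return PROM_ERROR_SKETCH;
}

/* Firmware that returns a made-up handle. */
static uint32_t fw_working(uint32_t instance)
{
    return instance + 0x100;
}
```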
    • Linux 3.13-rc8 · 7e22e911
      Linus Torvalds authored
    • SELinux: Fix possible NULL pointer dereference in selinux_inode_permission() · 3dc91d43
      Steven Rostedt authored
      While running stress tests on adding and deleting ftrace instances I hit
      this bug:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
        IP: selinux_inode_permission+0x85/0x160
        PGD 63681067 PUD 7ddbe067 PMD 0
        Oops: 0000 [#1] PREEMPT
        CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
        Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
        task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
        RIP: 0010:[<ffffffff812d8bc5>]  [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
        RSP: 0018:ffff88007ddb1c48  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
        RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
        RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
        R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
        R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
        FS:  00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
        Call Trace:
          security_inode_permission+0x1c/0x30
          __inode_permission+0x41/0xa0
          inode_permission+0x18/0x50
          link_path_walk+0x66/0x920
          path_openat+0xa6/0x6c0
          do_filp_open+0x43/0xa0
          do_sys_open+0x146/0x240
          SyS_open+0x1e/0x20
          system_call_fastpath+0x16/0x1b
        Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
        RIP  selinux_inode_permission+0x85/0x160
        CR2: 0000000000000020
      
      Investigating, I found that the inode->i_security was NULL, and the
      dereference of it caused the oops.
      
      In selinux_inode_permission():
      
      	isec = inode->i_security;
      
      	rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
      
      Note, the crash came from stressing the deletion and reading of debugfs
      files.  I was not able to recreate this via normal files.  But I'm not
      sure they are safe.  It may just be that the race window is much harder
      to hit.
      
      What seems to have happened (and what I have traced), is the file is
      being opened at the same time the file or directory is being deleted.
      As the dentry and inode locks are not held during the path walk, nor
      are the inode's ref counts being incremented, there is nothing saving
      these structures from being discarded except for an rcu_read_lock().
      
      The rcu_read_lock() protects against freeing of the inode, but it does
      not protect freeing of the inode_security_struct.  Now if the freeing of
      the i_security happens with a call_rcu(), and the i_security field of
      the inode is not changed (it gets freed as the inode gets freed) then
      there will be no issue here.  (Linus Torvalds suggested not setting the
      field to NULL such that we do not need to check if it is NULL in the
      permission check).
      
      Note, this is a hack, but it fixes the problem at hand.  A real fix is
      to restructure the destroy_inode() to call all the destructor handlers
      from the RCU callback.  But that is a major job to do, and requires a
      lot of work.  For now, we just band-aid this bug with this fix (it
      works), and work on a more maintainable solution in the future.
      
      Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
      Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only · eecc1e42
      Hugh Dickins authored
      We see General Protection Fault on RSI in copy_page_rep: that RSI is
      what you get from a NULL struct page pointer.
      
        RIP: 0010:[<ffffffff81154955>]  [<ffffffff81154955>] copy_page_rep+0x5/0x10
        RSP: 0000:ffff880136e15c00  EFLAGS: 00010286
        RAX: ffff880000000000 RBX: ffff880136e14000 RCX: 0000000000000200
        RDX: 6db6db6db6db6db7 RSI: db73880000000000 RDI: ffff880dd0c00000
        RBP: ffff880136e15c18 R08: 0000000000000200 R09: 000000000005987c
        R10: 000000000005987c R11: 0000000000000200 R12: 0000000000000001
        R13: ffffea00305aa000 R14: 0000000000000000 R15: 0000000000000000
        FS:  00007f195752f700(0000) GS:ffff880c7fc20000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000093010000 CR3: 00000001458e1000 CR4: 00000000000027e0
        Call Trace:
          copy_user_huge_page+0x93/0xab
          do_huge_pmd_wp_page+0x710/0x815
          handle_mm_fault+0x15d8/0x1d70
          __do_page_fault+0x14d/0x840
          do_page_fault+0x2f/0x90
          page_fault+0x22/0x30
      
      do_huge_pmd_wp_page() tests is_huge_zero_pmd(orig_pmd) four times: but
      since shrink_huge_zero_page() can free the huge_zero_page, and we have
      no hold of our own on it here (except where the fourth test holds
      page_table_lock and has checked pmd_same), it's possible for it to
      answer yes the first time, but no to the second or third test.  Change
      all those last three to tests for NULL page.
      
      (Note: this is not the same issue as trinity's DEBUG_PAGEALLOC BUG
      in copy_page_rep with RSI: ffff88009c422000, reported by Sasha Levin
      in https://lkml.org/lkml/2013/3/29/103.  I believe that one is due
      to the source page being split, and a tail page freed, while copy
      is in progress; and not a problem without DEBUG_PAGEALLOC, since
      the pmd_same check will prevent a miscopy from being made visible.)
      
      Fixes: 97ae1749 ("thp: implement refcounting for huge zero page")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org # v3.10 v3.11 v3.12
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • block: null_blk: fix queue leak inside removing device · 518d00b7
      Ming Lei authored
      When queue_mode is NULL_Q_MQ and null_blk is being removed,
      blk_cleanup_queue() isn't called to clean up the queue, so the
      allocated queue won't be freed.
      
      This patch calls blk_cleanup_queue() for MQ to drain all pending
      requests first and release the reference counter of queue kobject, then
      blk_mq_free_queue() will be called in queue kobject's release handler
      when queue kobject's reference counter drops to zero.
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround · 26bef131
      Linus Torvalds authored
      Before we do an EMMS in the AMD FXSAVE information leak workaround we
      need to clear any pending exceptions, otherwise we trap with a
      floating-point exception inside this code.
      Reported-by: halfdog <me@halfdog.net>
      Tested-by: Borislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/CA%2B55aFxQnY_PCG_n4=0w-VG=YLXL-yr7oMxyy0WU2gCBAf3ydg@mail.gmail.com
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  7. 10 Jan, 2014 20 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 228fdc08
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Famous last words: "final pull request" :-)
      
        I'm sending this because Jason Wang's fixes are pretty important
      
         1) Add missing per-cpu stats initialization to ip6_vti.  Otherwise
            lockdep spits out a call trace.  From Li RongQing.
      
         2) Fix NULL oops in wireless hwsim, from Javier Lopez
      
         3) TIPC deferred packet queue unlink must NULL out skb->next to avoid
            crashes.  From Erik Hugne
      
         4) Fix access to uninitialized buffer in nf_nat netfilter code, from
            Daniel Borkmann
      
         5) Fix lifetime of ipv6 loopback and SIT tunnel addresses, otherwise
            they basically timeout immediately.  From Hannes Frederic Sowa
      
         6) Fix DMA unmapping of TSO packets in bnx2x driver, from Michal
            Schmidt
      
         7) Do not allow L2 forwarding offload via macvtap device, the way
            things are now it will not end up being forwarded at all.  From
            Jason Wang
      
         8) Fix transmit queue selection via ndo_dfwd_start_xmit(), fixing
            things like applying NETIF_F_LLTX to the wrong device (!!) and
            eliding the proper transmit watchdog handling
      
         9) qlcnic driver was not updating tx statistics at all, from Manish
            Chopra"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        qlcnic: Fix ethtool statistics length calculation
        qlcnic: Fix bug in TX statistics
        net: core: explicitly select a txq before doing l2 forwarding
        macvlan: forbid L2 fowarding offload for macvtap
        bnx2x: fix DMA unmapping of TSO split BDs
        ipv6: add link-local, sit and loopback address with INFINITY_LIFE_TIME
        bnx2x: prevent WARN during driver unload
        tipc: correctly unlink packets from deferred packet queue
        ipv6: pcpu_tstats.syncp should be initialised in ip6_vti.c
        netfilter: only warn once on wrong seqadj usage
        netfilter: nf_nat: fix access to uninitialized buffer in IRC NAT helper
        NFC: Fix target mode p2p link establishment
        iwlwifi: add new devices for 7265 series
        mac80211: move "bufferable MMPDU" check to fix AP mode scan
        mac80211_hwsim: Fix NULL pointer dereference
      228fdc08
    • Linus Torvalds's avatar
      Merge tag 'xfs-for-linus-v3.13-rc8' of git://oss.sgi.com/xfs/xfs · e2bc4470
      Linus Torvalds authored
      Pull xfs bugfixes from Ben Myers:
       "Here we have a bugfix for an off-by-one in the remote attribute
        verifier that results in a forced shutdown which you can hit with v5
        superblock by creating a 64k xattr, and a fix for a missing
        destroy_work_on_stack() in the allocation worker.
      
        It's a bit late, but they are both fairly straightforward"
      
      * tag 'xfs-for-linus-v3.13-rc8' of git://oss.sgi.com/xfs/xfs:
        xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK()
        xfs: fix off-by-one error in xfs_attr3_rmt_verify
      e2bc4470
    • Linus Torvalds's avatar
      Merge branch 'leds-fixes-for-3.13' of... · 324c66ff
      Linus Torvalds authored
      Merge branch 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds
      
      Pull LED fix from Bryan Wu:
       "Pali Rohár and Pavel Machek reported the LED of Nokia N900 doesn't
        work with our latest 3.13-rc6 kernel.  Milo fixed the regression here"
      
      * 'leds-fixes-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
        leds: lp5521/5523: Remove duplicate mutex
      324c66ff
    • Linus Torvalds's avatar
      Merge tag 'pm+acpi-3.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · cff539b1
      Linus Torvalds authored
      Pull ACPI and power management fixes from Rafael Wysocki:
      
       - Recent commits modifying the lists of C-states in the intel_idle
         driver introduced bugs leading to crashes on some systems.  Two fixes
         from Jiang Liu.
      
       - The ACPI AC driver should receive all types of notifications, but
         a recent change made it ignore some of them.  Fix from Alexander Mezin.
      
       - intel_pstate's validity checks for MSRs it depends on are not
         sufficient to catch the lack of support in nested KVM setups, so they
         are extended to cover that case.  From Dirk Brandewie.
      
       - NEC LZ750/LS has a botched up _BIX method in its ACPI tables, so our
         ACPI battery driver needs a quirk for it.  From Lan Tianyu.
      
       - The tpm_ppi driver sometimes leaks memory allocated by
         acpi_get_name().  Fix from Jiang Liu.
      
      * tag 'pm+acpi-3.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        intel_idle: close avn_cstates array with correct marker
        Revert "intel_idle: mark states tables with __initdata tag"
        ACPI / Battery: Add a _BIX quirk for NEC LZ750/LS
        intel_pstate: Add X86_FEATURE_APERFMPERF to cpu match parameters.
        ACPI / TPM: fix memory leak when walking ACPI namespace
        ACPI / AC: change notification handler type to ACPI_ALL_NOTIFY
      cff539b1
    • Linus Torvalds's avatar
      Merge tag 'mfd-fixes-3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes · c43a5eb2
      Linus Torvalds authored
      Pull MFD fix from Samuel Ortiz:
       "This is the 2nd MFD pull request for 3.13
      
        It only contains one fix for the rtsx_pcr driver.  Without it we see a
        kernel panic on some machines, when resuming from suspend to RAM"
      
      * tag 'mfd-fixes-3.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-fixes:
        mfd: rtsx_pcr: Disable interrupts before cancelling delayed works
      c43a5eb2
    • Milo Kim's avatar
      leds: lp5521/5523: Remove duplicate mutex · e70988d1
      Milo Kim authored
      The duplicate mutex is a problem when a pattern is loaded via the firmware
      interface: the LP55xx common driver has already locked the mutex in
      'lp55xx_firmware_loaded()', so the second lock should be deleted.
      
      On the other hand, locks are required in store_engine_load()
      when updating program memory.
      Reported-by: Pali Rohár <pali.rohar@gmail.com>
      Reported-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Milo Kim <milo.kim@ti.com>
      Signed-off-by: Bryan Wu <cooloney@gmail.com>
      Cc: <stable@vger.kernel.org>
      e70988d1
    • Chuansheng Liu's avatar
      xfs: Calling destroy_work_on_stack() to pair with INIT_WORK_ONSTACK() · 1f4a63bf
      Chuansheng Liu authored
      When CONFIG_DEBUG_OBJECTS_WORK is defined, destroy_work_on_stack(),
      which frees the debug object, must be called to pair with
      INIT_WORK_ONSTACK().
      Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com>
      Reviewed-by: Ben Myers <bpm@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit 6f96b306)
      1f4a63bf
    • Jie Liu's avatar
      xfs: fix off-by-one error in xfs_attr3_rmt_verify · bba719b5
      Jie Liu authored
      With CRC checking enabled, setting an attribute value whose size is
      exactly XATTR_SIZE_MAX causes the v3 remote attr write verification
      to fail, which yields a back trace like the one below:
      
      <snip>
      XFS (sda7): Internal error xfs_attr3_rmt_write_verify at line 191 of file fs/xfs/xfs_attr_remote.c
      <snip>
      Call Trace:
      [<ffffffff816f0042>] dump_stack+0x45/0x56
      [<ffffffffa0d99c8b>] xfs_error_report+0x3b/0x40 [xfs]
      [<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
      [<ffffffffa0d99ce5>] xfs_corruption_error+0x55/0x80 [xfs]
      [<ffffffffa0dbef6b>] xfs_attr3_rmt_write_verify+0x14b/0x1a0 [xfs]
      [<ffffffffa0d96edd>] ? _xfs_buf_ioapply+0x6d/0x390 [xfs]
      [<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
      [<ffffffffa0d96edd>] _xfs_buf_ioapply+0x6d/0x390 [xfs]
      [<ffffffff81184cda>] ? vm_map_ram+0x31a/0x460
      [<ffffffff81097230>] ? wake_up_state+0x20/0x20
      [<ffffffffa0d97315>] ? xfs_bdstrat_cb+0x55/0xb0 [xfs]
      [<ffffffffa0d9726b>] xfs_buf_iorequest+0x6b/0xc0 [xfs]
      [<ffffffffa0d97315>] xfs_bdstrat_cb+0x55/0xb0 [xfs]
      [<ffffffffa0d97906>] xfs_bwrite+0x46/0x80 [xfs]
      [<ffffffffa0dbfa94>] xfs_attr_rmtval_set+0x334/0x490 [xfs]
      [<ffffffffa0db84aa>] xfs_attr_leaf_addname+0x24a/0x410 [xfs]
      [<ffffffffa0db8893>] xfs_attr_set_int+0x223/0x470 [xfs]
      [<ffffffffa0db8b76>] xfs_attr_set+0x96/0xb0 [xfs]
      [<ffffffffa0db13b2>] xfs_xattr_set+0x42/0x70 [xfs]
      [<ffffffff811df9b2>] generic_setxattr+0x62/0x80
      [<ffffffff811e0213>] __vfs_setxattr_noperm+0x63/0x1b0
      [<ffffffff81307afe>] ? evm_inode_setxattr+0xe/0x10
      [<ffffffff811e0415>] vfs_setxattr+0xb5/0xc0
      [<ffffffff811e054e>] setxattr+0x12e/0x1c0
      [<ffffffff811c6e82>] ? final_putname+0x22/0x50
      [<ffffffff811c708b>] ? putname+0x2b/0x40
      [<ffffffff811cc4bf>] ? user_path_at_empty+0x5f/0x90
      [<ffffffff811bdfd9>] ? __sb_start_write+0x49/0xe0
      [<ffffffff81168589>] ? vm_mmap_pgoff+0x99/0xc0
      [<ffffffff811e07df>] SyS_setxattr+0x8f/0xe0
      [<ffffffff81700c2d>] system_call_fastpath+0x1a/0x1f
      
      Tests:
          setfattr -n user.longxattr -v `perl -e 'print "A"x65536'` testfile
      
      This patch fixes it by checking whether the remote EA size is greater
      than XATTR_SIZE_MAX, rather than greater than or equal to it, because
      an EA value whose size equals the limit is valid as per the VFS
      setxattr interface.
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Mark Tinguely <tinguely@sgi.com>
      Signed-off-by: Ben Myers <bpm@sgi.com>
      
      (cherry picked from commit 85dd0707)
      bba719b5
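      The boundary condition fixed by bba719b5 can be sketched in userspace C.
      This is a simplified illustration, not the code in
      fs/xfs/xfs_attr_remote.c: both helper names are hypothetical, and
      XATTR_SIZE_MAX is redefined locally with the 65536-byte size that the
      setfattr test in the commit message exercises.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the kernel constant (assumption: 65536 bytes, the
 * value size used by the setfattr test in the commit message). */
#define XATTR_SIZE_MAX 65536

/* Hypothetical model of the pre-fix check: a value of exactly
 * XATTR_SIZE_MAX bytes is rejected because of the ">=" comparison. */
static bool rmt_size_ok_before_fix(size_t bytes)
{
    return !(bytes >= XATTR_SIZE_MAX);
}

/* Hypothetical model of the post-fix check: only sizes strictly greater
 * than XATTR_SIZE_MAX are rejected. */
static bool rmt_size_ok_after_fix(size_t bytes)
{
    return !(bytes > XATTR_SIZE_MAX);
}

/* Usage example: the two checks disagree only at the boundary itself. */
void rmt_size_boundary_demo(void)
{
    assert(rmt_size_ok_before_fix(XATTR_SIZE_MAX - 1));
    assert(rmt_size_ok_after_fix(XATTR_SIZE_MAX - 1));
    assert(!rmt_size_ok_before_fix(XATTR_SIZE_MAX)); /* the off-by-one */
    assert(rmt_size_ok_after_fix(XATTR_SIZE_MAX));
}
```

      The real verifier inspects more than the byte count; only the size
      comparison is modelled here.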
    • Shahed Shaikh's avatar
      qlcnic: Fix ethtool statistics length calculation · d6e9c89a
      Shahed Shaikh authored
      o Consider number of Tx queues while calculating the length of
        Tx statistics as part of ethtool stats.
      o Calculate statistics length properly for 82xx and 83xx adapters
      Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d6e9c89a
    • Manish Chopra's avatar
      qlcnic: Fix bug in TX statistics · 1ac6762a
      Manish Chopra authored
      o Driver was not updating TX stats so it was not populating
        statistics in `ifconfig` command output.
      Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1ac6762a
    • Jason Wang's avatar
      net: core: explicitly select a txq before doing l2 forwarding · f663dd9a
      Jason Wang authored
      Currently, the tx queue is selected implicitly in ndo_dfwd_start_xmit(). This
      causes several issues:
      
      - NETIF_F_LLTX was removed for macvlan, so the txq lock is taken for macvlan
        instead of the lower device. This misses the txq synchronization the lower
        device needs, such as the txq stopping or freezing required by the dev
        watchdog or the control path.
      - dev_hard_start_xmit() is called with a NULL txq, which bypasses the net device
        watchdog.
      - dev_hard_start_xmit() does not check txq everywhere, which leads to a crash
        when tso is disabled for the lower device.
      
      Fix this by explicitly introducing a new parameter for .ndo_select_queue(), used
      only for selecting queues in the case of l2 forwarding offload. netdev_pick_tx()
      is also extended to accept this parameter, and dev_queue_xmit_accel() is used to
      do the l2 forwarding transmission.
      
      With this fix, NETIF_F_LLTX can be preserved for macvlan and there is no need
      to check txq against NULL in dev_hard_start_xmit(). There is also no need to keep
      a dedicated ndo_dfwd_start_xmit(); we can just reuse the code of
      dev_queue_xmit() to do the transmission.
      
      In the future, this will also be required for macvtap l2 forwarding support,
      since it provides a necessary synchronization method.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: e1000-devel@lists.sourceforge.net
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f663dd9a
    • Jason Wang's avatar
      macvlan: forbid L2 fowarding offload for macvtap · b13ba1b8
      Jason Wang authored
      L2 forwarding offload bypasses the rx handler of the real device, so packets
      cannot be forwarded to the macvtap device. Another problem is that the
      dev_hard_start_xmit() called for macvtap does not have any synchronization.
      
      Fix this by forbidding L2 forwarding for macvtap.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b13ba1b8
    • David S. Miller's avatar
      Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · c4d70998
      David S. Miller authored
      John W. Linville says:
      
      ====================
      For the mac80211 bits, Johannes says:
      
      "I have a fix from Javier for mac80211_hwsim when used with wmediumd
      userspace, and a fix from Felix for buffering in AP mode."
      
      For the NFC bits, Samuel says:
      
      "This pull request only contains one fix for a regression introduced with
      commit e29a9e2a. Without this fix, we can not establish a p2p link
      in target mode. Only initiator mode works."
      
      For the iwlwifi bits, Emmanuel says:
      
      "It only includes new device IDs so it's not vital. If you have a pull
      request to net.git anyway, I'd be happy to have this in."
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c4d70998
    • Michal Schmidt's avatar
      bnx2x: fix DMA unmapping of TSO split BDs · 95e92fd4
      Michal Schmidt authored
      bnx2x triggers warnings with CONFIG_DMA_API_DEBUG=y:
      
        WARNING: CPU: 0 PID: 2253 at lib/dma-debug.c:887 check_unmap+0xf8/0x920()
        bnx2x 0000:28:00.0: DMA-API: device driver frees DMA memory with
        different size [device address=0x00000000da2b389e] [map size=1490 bytes]
        [unmap size=66 bytes]
      
      The reason is that bnx2x splits a TSO BD into two BDs (headers + data)
      using one DMA mapping for both, but it uses only the length of the first
      BD when unmapping.
      
      This patch fixes the bug by unmapping the whole length of the two BDs.
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      95e92fd4
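      The size mismatch described in 95e92fd4 can be sketched with a small
      userspace model. This is only an illustration under stated assumptions,
      not bnx2x code: struct sketch_bd and both helpers are hypothetical, and
      the usage below splits the 1490-byte mapping from the warning into a
      66-byte header BD plus the remaining 1424 bytes of data (the 1424 is
      derived from the two reported sizes, not stated in the commit).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified buffer descriptor: only the byte count of the
 * BD matters for this sketch. */
struct sketch_bd {
    unsigned int nbytes;
};

/* Pre-fix behaviour: only the first (header) BD's length is unmapped,
 * although one DMA mapping covers both BDs. */
static unsigned int unmap_len_before_fix(const struct sketch_bd *hdr,
                                         const struct sketch_bd *data)
{
    (void)data; /* the data BD's length was not accounted for */
    return hdr->nbytes;
}

/* Post-fix behaviour: unmap the whole length covered by the single
 * mapping, i.e. header BD plus data BD. */
static unsigned int unmap_len_after_fix(const struct sketch_bd *hdr,
                                        const struct sketch_bd *data)
{
    assert(hdr != NULL && data != NULL);
    return hdr->nbytes + data->nbytes;
}

/* Usage example with the sizes reported by CONFIG_DMA_API_DEBUG. */
void unmap_len_demo(void)
{
    struct sketch_bd hdr = { 66 };
    struct sketch_bd data = { 1490 - 66 };

    assert(unmap_len_before_fix(&hdr, &data) == 66);  /* "unmap size" */
    assert(unmap_len_after_fix(&hdr, &data) == 1490); /* "map size" */
}
```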
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux · 21e20e22
      Linus Torvalds authored
      Pull clock fixes from Mike Turquette:
       "Late fixes for clock drivers.  All of these fixes are for user-visible
        regressions, typically boot failures or other unsafe system
        configuration that causes badness"
      
      * tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
        clk: clk-divider: fix divisor > 255 bug
        clk: exynos: File scope reg_save array should depend on PM_SLEEP
        clk: samsung: exynos5250: Add CLK_IGNORE_UNUSED flag for the sysreg clock
        ARM: dts: exynos5250: Fix MDMA0 clock number
        clk: samsung: exynos5250: Add MDMA0 clocks
        clk: samsung: exynos5250: Fix ACP gate register offset
        clk: exynos5250: fix sysmmu_mfc{l,r} gate clocks
        clk: samsung: exynos4: Correct SRC_MFC register
      21e20e22
    • Linus Torvalds's avatar
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 2aa63ce0
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A few fixes for Renesas platforms to fixup DMA masks (this started
        causing errors once the DMA API added checks for valid masks in 3.13)"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: shmobile: mackerel: Fix coherent DMA mask
        ARM: shmobile: kzm9g: Fix coherent DMA mask
        ARM: shmobile: armadillo: Fix coherent DMA mask
      2aa63ce0
    • Hannes Frederic Sowa's avatar
      ipv6: add link-local, sit and loopback address with INFINITY_LIFE_TIME · 07edd741
      Hannes Frederic Sowa authored
      In the past, the IFA_PERMANENT flag indicated that the valid and preferred
      lifetimes were ignored. Since change fad8da3e ("ipv6 addrconf: fix
      preferred lifetime state-changing behavior while valid_lft is infinity")
      we honour at least the preferred lifetime on those addresses. As such
      the valid lifetime gets recalculated and updated to 0.
      
      If loopback address is added manually this problem does not occur.
      Also if NetworkManager manages IPv6, those addresses will get added via
      inet6_rtm_newaddr and thus will have a correct lifetime, too.
      Reported-by: François-Xavier Le Bail <fx.lebail@yahoo.com>
      Reported-by: Damien Wyart <damien.wyart@gmail.com>
      Fixes: fad8da3e ("ipv6 addrconf: fix preferred lifetime state-changing behavior while valid_lft is infinity")
      Cc: Yasushi Asano <yasushi.asano@jp.fujitsu.com>
      Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      07edd741
    • Yuval Mintz's avatar
      bnx2x: prevent WARN during driver unload · 9a2620c8
      Yuval Mintz authored
      Starting with commit 80c33ddd "net: add might_sleep() call to napi_disable"
      bnx2x fails the might_sleep tests causing a stack trace to appear whenever
      the driver is unloaded, as local_bh_disable() is being called before
      napi_disable().
      
      This changes the locking scheme related to CONFIG_NET_RX_BUSY_POLL,
      preventing the need for calling local_bh_disable() and thus eliminating
      the issue.
      Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com>
      Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
      Signed-off-by: Ariel Elior <ariele@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9a2620c8
    • Rafael J. Wysocki's avatar
      Merge branch 'pm-cpuidle' · 13de22c5
      Rafael J. Wysocki authored
      * pm-cpuidle:
        intel_idle: close avn_cstates array with correct marker
        Revert "intel_idle: mark states tables with __initdata tag"
      13de22c5
    • Jiang Liu's avatar
      intel_idle: close avn_cstates array with correct marker · 88390996
      Jiang Liu authored
      Close avn_cstates array with correct marker to avoid overflow
      in function intel_idle_cpu_init().
      
      [rjw: The problem was introduced when commit 22e580d0 was merged
       on top of eba682a5 (intel_idle: shrink states tables).]
      
      Fixes: 22e580d0 (intel_idle: Fixed C6 state on Avoton/Rangeley processors)
      Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      88390996
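      Why a missing end marker leads to an overflow, as in 88390996, can be
      sketched with a sentinel-terminated table. This is a hedged userspace
      illustration, not intel_idle code: struct sketch_state, the demo table
      and count_states() are hypothetical, and using an entry with a NULL
      enter callback as the terminator is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for an idle state table entry. */
struct sketch_state {
    const char *name;
    int (*enter)(int index);
};

/* Placeholder callback so that real entries have a non-NULL .enter. */
static int enter_stub(int index)
{
    return index;
}

/* The table has no explicit size; walkers rely on the final sentinel. */
static const struct sketch_state demo_states[] = {
    { .name = "C1", .enter = enter_stub },
    { .name = "C6", .enter = enter_stub },
    { .enter = NULL }, /* end marker: without it the walk never stops here */
};

/* Count entries up to the sentinel. If the sentinel were missing, this
 * loop would read past the end of the array. */
static size_t count_states(const struct sketch_state *table)
{
    size_t n = 0;

    assert(table != NULL);
    while (table[n].enter != NULL)
        n++;
    return n;
}
```

      Calling count_states(demo_states) walks the two real entries and stops
      at the marker.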
  8. 09 Jan, 2014 1 commit