1. 12 Sep, 2014 18 commits
  2. 03 Sep, 2014 1 commit
  3. 27 Aug, 2014 21 commits
    • x86/espfix/xen: Fix allocation of pages for paravirt page tables · 5d1b4311
      Boris Ostrovsky authored
      commit 8762e509 upstream.
      
      init_espfix_ap() is currently off by one level when informing hypervisor
      that allocated pages will be used for ministacks' page tables.
      
      The most immediate effect of this on a PV guest is that if
      'stack_page = __get_free_page()' returns a non-zeroed-out page the hypervisor
      will refuse to use it for a page table (which it shouldn't be anyway). This will
      result in warnings by both Xen and Linux.
      
      More importantly, a subsequent write to that page (again, by a PV guest) is
      likely to result in a fatal page fault.
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Link: http://lkml.kernel.org/r/1404926298-5565-1-git-send-email-boris.ostrovsky@oracle.com
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: mvneta: replace Tx timer with a real interrupt · 10d937e0
      willy tarreau authored
      Right now the mvneta driver doesn't handle Tx IRQ, and relies on two
      mechanisms to flush Tx descriptors : a flush at the end of mvneta_tx()
      and a timer. If a burst of packets is emitted faster than the device
      can send them, then the queue is stopped until next wake-up of the
      timer 10ms later. This causes jerky output traffic with bursts and
      pauses, making it difficult to reach line rate with very few streams.
      
      A test on UDP traffic shows that it's not possible to go beyond 134
      Mbps / 12 kpps of outgoing traffic with 1500-bytes IP packets. Routed
      traffic tends to observe pauses as well if the traffic is bursty,
      making it even burstier after the wake-up.
      
      It seems that this feature was inherited from the original driver but
      nothing there mentions any reason for not using the interrupt instead,
      which the chip supports.
      
      Thus, this patch enables Tx interrupts and removes the timer. It does
      the two at once because it's not really possible to make the two
      mechanisms coexist, so a split patch doesn't make sense.
      
      First tests performed on a Mirabox (Armada 370) show that less CPU
      seems to be used when sending traffic. One reason might be that we now
      call the mvneta_tx_done_gbe() with a mask indicating which queues have
      been done instead of looping over all of them.
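
The mask-driven completion described above can be sketched as follows. This is only an illustration, not the driver's code: `NUM_TXQS`, `txq_done()` and `tx_done_by_mask()` are hypothetical names, and the queue count of 8 is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TXQS 8 /* assumed number of Tx queues for this sketch */

/* Hypothetical per-queue completion handler: records that a queue was serviced. */
static void txq_done(unsigned int *serviced, unsigned int queue)
{
    assert(queue < NUM_TXQS);
    serviced[queue]++;
}

/*
 * Service only the queues whose bit is set in the cause mask,
 * instead of looping over every queue. Returns the number of
 * queues that were visited.
 */
static unsigned int tx_done_by_mask(uint32_t cause, unsigned int *serviced)
{
    unsigned int visited = 0;

    cause &= (1u << NUM_TXQS) - 1; /* ignore bits beyond the known queues */
    while (cause) {
        unsigned int queue = 0;

        /* Find the lowest set bit. */
        while (!((cause >> queue) & 1u))
            queue++;

        txq_done(serviced, queue);
        cause &= cause - 1; /* clear the bit that was just handled */
        visited++;
    }
    return visited;
}
```

With a cause mask of 0x05, only queues 0 and 2 are visited, so the work done scales with the number of completed queues rather than with the total number of queues.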
      
      The same UDP test above now happily reaches 987 Mbps / 87.7 kpps.
      Single-stream TCP traffic can now more easily reach line rate. HTTP
      transfers of 1 MB objects over a single connection went from 730 to
      840 Mbps. It is even possible to go significantly higher (>900 Mbps)
      by tweaking tcp_tso_win_divisor.
      
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Cc: Arnaud Ebalard <arno@natisbad.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Tested-by: Arnaud Ebalard <arno@natisbad.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: David S. Miller <davem@davemloft.net>

      (cherry picked from commit 71f6d1b3)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • net: mvneta: add missing bit descriptions for interrupt masks and causes · 7a399737
      willy tarreau authored
      Marvell has not published the chip's datasheet yet, so it's very hard
      to find the relevant bits to manipulate to change the IRQ behaviour.
      Fortunately, these bits are described in the proprietary LSP patch set
      which is publicly available here :
      
          http://www.plugcomputer.org/downloads/mirabox/
      
      So let's put them back in the driver in order to reduce the burden of
      current and future maintenance.
      
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Tested-by: Arnaud Ebalard <arno@natisbad.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: David S. Miller <davem@davemloft.net>

      (cherry picked from commit 40ba35e7)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • net: mvneta: do not schedule in mvneta_tx_timeout · f1b0e3b7
      willy tarreau authored
      If a queue timeout is reported, we can oops because of some
      schedules while the caller is atomic, as shown below :
      
        mvneta d0070000.ethernet eth0: tx timeout
        BUG: scheduling while atomic: bash/1528/0x00000100
        Modules linked in: slhttp_ethdiv(C) [last unloaded: slhttp_ethdiv]
        CPU: 2 PID: 1528 Comm: bash Tainted: G        WC   3.13.0-rc4-mvebu-nf #180
        [<c0011bd9>] (unwind_backtrace+0x1/0x98) from [<c000f1ab>] (show_stack+0xb/0xc)
        [<c000f1ab>] (show_stack+0xb/0xc) from [<c02ad323>] (dump_stack+0x4f/0x64)
        [<c02ad323>] (dump_stack+0x4f/0x64) from [<c02abe67>] (__schedule_bug+0x37/0x4c)
        [<c02abe67>] (__schedule_bug+0x37/0x4c) from [<c02ae261>] (__schedule+0x325/0x3ec)
        [<c02ae261>] (__schedule+0x325/0x3ec) from [<c02adb97>] (schedule_timeout+0xb7/0x118)
        [<c02adb97>] (schedule_timeout+0xb7/0x118) from [<c0020a67>] (msleep+0xf/0x14)
        [<c0020a67>] (msleep+0xf/0x14) from [<c01dcbe5>] (mvneta_stop_dev+0x21/0x194)
        [<c01dcbe5>] (mvneta_stop_dev+0x21/0x194) from [<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24)
        [<c01dcfe9>] (mvneta_tx_timeout+0x19/0x24) from [<c024afc7>] (dev_watchdog+0x18b/0x1c4)
        [<c024afc7>] (dev_watchdog+0x18b/0x1c4) from [<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c)
        [<c0020b53>] (call_timer_fn.isra.27+0x17/0x5c) from [<c0020cad>] (run_timer_softirq+0x115/0x170)
        [<c0020cad>] (run_timer_softirq+0x115/0x170) from [<c001ccb9>] (__do_softirq+0xbd/0x1a8)
        [<c001ccb9>] (__do_softirq+0xbd/0x1a8) from [<c001cfad>] (irq_exit+0x61/0x98)
        [<c001cfad>] (irq_exit+0x61/0x98) from [<c000d4bf>] (handle_IRQ+0x27/0x60)
        [<c000d4bf>] (handle_IRQ+0x27/0x60) from [<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8)
        [<c000843b>] (armada_370_xp_handle_irq+0x33/0xc8) from [<c000fba9>] (__irq_usr+0x49/0x60)
      
      Ben Hutchings proposed a better fix that uses a scheduled work item
      for this, but while it fixed this panic, it caused other random
      freezes and panics, proving that the reset sequence in the driver
      is unreliable and that additional fixes should be investigated.
      
      When sending multiple streams over a link limited to 100 Mbps, Tx timeouts
      happen from time to time, and the driver correctly recovers only when the
      function is disabled.
      
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Tested-by: Arnaud Ebalard <arno@natisbad.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: David S. Miller <davem@davemloft.net>

      (cherry picked from commit 29021366)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • net: mvneta: use per_cpu stats to fix an SMP lock up · 6a026d8f
      willy tarreau authored
      Stats writers are mvneta_rx() and mvneta_tx(). They don't lock anything
      when they update the stats, and as a result, it randomly happens that
      the stats freeze on SMP if two updates happen during stats retrieval.
      This is very easily reproducible by starting two HTTP servers and binding
      each of them to a different CPU, then consulting /proc/net/dev in loops
      during transfers, the interface should immediately lock up. This issue
      also randomly happens upon link state changes during transfers, because
      the stats are collected in this situation, but it takes more attempts to
      reproduce it.
      
      The comments in netdevice.h suggest using per_cpu stats instead to get
      rid of this issue.
      
      This patch implements this. It merges both rx_stats and tx_stats into
      a single "stats" member with a single syncp. Both mvneta_rx() and
      mvneta_tx() now only update a single CPU's counters.
      
      In turn, mvneta_get_stats64() does the summing by iterating over all CPUs
      to get their respective stats.
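
A minimal user-space sketch of the per-CPU scheme described above, under stated assumptions: `SKETCH_NR_CPUS`, `struct pcpu_stats`, `account_rx()`, `account_tx()` and `sum_stats()` are hypothetical names, and the sequence-counter synchronisation (the "syncp" mentioned above) is deliberately left out, so this shows only the data layout and the summing step.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4 /* assumed CPU count for the sketch */

/* One merged rx/tx counter block per CPU. */
struct pcpu_stats {
    uint64_t rx_packets;
    uint64_t rx_bytes;
    uint64_t tx_packets;
    uint64_t tx_bytes;
};

static struct pcpu_stats cpu_stats[SKETCH_NR_CPUS];

/* A writer touches only the counters of the CPU it runs on. */
static void account_rx(unsigned int cpu, uint32_t packets, uint32_t bytes)
{
    assert(cpu < SKETCH_NR_CPUS);
    cpu_stats[cpu].rx_packets += packets;
    cpu_stats[cpu].rx_bytes += bytes;
}

static void account_tx(unsigned int cpu, uint32_t packets, uint32_t bytes)
{
    assert(cpu < SKETCH_NR_CPUS);
    cpu_stats[cpu].tx_packets += packets;
    cpu_stats[cpu].tx_bytes += bytes;
}

/* The reader iterates over all CPUs and sums their counters. */
static struct pcpu_stats sum_stats(void)
{
    struct pcpu_stats total = {0, 0, 0, 0};

    for (unsigned int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
        total.rx_packets += cpu_stats[cpu].rx_packets;
        total.rx_bytes += cpu_stats[cpu].rx_bytes;
        total.tx_packets += cpu_stats[cpu].tx_packets;
        total.tx_bytes += cpu_stats[cpu].tx_bytes;
    }
    return total;
}
```

Because each writer owns its own counter block, two CPUs never update the same counters, which is the property the commit relies on.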
      
      With this change, stats are still correct and no more lockup is encountered.
      
      Note that this bug was present since the first import of the mvneta
      driver.  It might make sense to backport it to some stable trees. If
      so, it depends on "d33dc73 net: mvneta: increase the 64-bit rx/tx stats
      out of the hot path".
      
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Arnaud Ebalard <arno@natisbad.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: David S. Miller <davem@davemloft.net>

      (cherry picked from commit 74c41b04)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • net: mvneta: increase the 64-bit rx/tx stats out of the hot path · 7d798913
      willy tarreau authored
      It is better to count packets and bytes on the stack in 32-bit
      variables and accumulate them once at the end. This saves two
      memory writes and two memory barriers per packet. The incoming
      packet rate increased by 4.7% on the Openblocks AX3 thanks to this.
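
The idea can be sketched like this, with hypothetical names (`struct rx_totals`, `rx_batch()`) and an `updates` field added purely so the sketch can show how often the shared 64-bit counters are written.

```c
#include <assert.h>
#include <stdint.h>

/* Shared 64-bit totals; "updates" counts writes to them (illustration only). */
struct rx_totals {
    uint64_t packets;
    uint64_t bytes;
    unsigned int updates;
};

/*
 * Process a batch of received packets. Packets and bytes are counted
 * in local 32-bit variables, and the 64-bit totals are touched once
 * per batch rather than once per packet.
 */
static void rx_batch(struct rx_totals *totals, const uint32_t *lengths,
                     unsigned int count)
{
    uint32_t local_packets = 0;
    uint32_t local_bytes = 0;

    assert(totals != 0);
    for (unsigned int i = 0; i < count; i++) {
        local_packets++;
        local_bytes += lengths[i];
    }

    if (local_packets) {
        totals->packets += local_packets;
        totals->bytes += local_bytes;
        totals->updates++;
    }
}
```

For a batch of three packets the totals are written once instead of three times, which is where the saved memory writes and barriers would come from.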
      
      Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Arnaud Ebalard <arno@natisbad.org>
      Signed-off-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: David S. Miller <davem@davemloft.net>

      (cherry picked from commit dc4277dd)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • Revert "mac80211: move "bufferable MMPDU" check to fix AP mode scan" · 9e1bbd3f
      Johannes Berg authored
      This reverts commit 277d916f as it was
      at least breaking iwlwifi by setting the IEEE80211_TX_CTL_NO_PS_BUFFER
      flag in all kinds of interface modes, not only for AP mode where it is
      appropriate.
      
      To avoid reintroducing the original problem, explicitly check for probe
      request frames in the multicast buffering code.
      
      Cc: stable@vger.kernel.org
      Fixes: 277d916f ("mac80211: move "bufferable MMPDU" check to fix AP mode scan")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>

      (cherry picked from commit 08b99399)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86_64/entry/xen: Do not invoke espfix64 on Xen · 350cc3fc
      Andy Lutomirski authored
      This moves the espfix64 logic into native_iret.  To make this work,
      it gets rid of the native patch for INTERRUPT_RETURN:
      INTERRUPT_RETURN on native kernels is now 'jmp native_iret'.
      
      This changes the 16-bit SS behavior on Xen from OOPSing to leaking
      some bits of the Xen hypervisor's RSP (I think).
      
      [ hpa: this is a nonzero cost on native, but probably not enough to
        measure. Xen needs to fix this in their own code, probably doing
        something equivalent to espfix64. ]
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Link: http://lkml.kernel.org/r/7b8f1d8ef6597cb16ae004a43c56980a7de3cf94.1406129132.git.luto@amacapital.net
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: <stable@vger.kernel.org>

      (cherry picked from commit 7209a75d)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86, espfix: Make it possible to disable 16-bit support · 5190bdb7
      H. Peter Anvin authored
      Embedded systems, which may be very memory-size-sensitive, are
      extremely unlikely to ever encounter any 16-bit software, so make it
      a CONFIG_EXPERT option to turn off support for any 16-bit software
      whatsoever.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com

      (cherry picked from commit 34273f41)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86, espfix: Make espfix64 a Kconfig option, fix UML · 54d052e0
      H. Peter Anvin authored
      Make espfix64 a hidden Kconfig option.  This fixes the x86-64 UML
      build which had broken due to the non-existence of init_espfix_bsp()
      in UML: since UML uses its own Kconfig, this option does not appear in
      the UML build.
      
      This also makes it possible to make support for 16-bit segments a
      configuration option, for the people who want to minimize the size of
      the kernel.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Richard Weinberger <richard@nod.at>
      Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com

      (cherry picked from commit 197725de)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86, espfix: Fix broken header guard · 04d5b66d
      H. Peter Anvin authored
      Header guard is #ifndef, not #ifdef...
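
The difference can be illustrated with a small sketch. The macro names are hypothetical and are not the ones used in the kernel header; the point is only that a guard opened with #ifdef skips its own body, because the guard macro is not yet defined on first inclusion.

```c
#include <assert.h>

/* Correct guard: the body is processed the first time through. */
#ifndef SKETCH_GOOD_GUARD_H
#define SKETCH_GOOD_GUARD_H
#define SKETCH_GOOD_BODY_SEEN 1
#endif

/* Broken guard: the macro is undefined, so the body is always skipped. */
#ifdef SKETCH_BROKEN_GUARD_H
#define SKETCH_BROKEN_GUARD_H
#define SKETCH_BROKEN_BODY_SEEN 1
#endif

/* Report whether the body behind each guard was seen by the preprocessor. */
static int good_guard_body_seen(void)
{
#ifdef SKETCH_GOOD_BODY_SEEN
    return 1;
#else
    return 0;
#endif
}

static int broken_guard_body_seen(void)
{
#ifdef SKETCH_BROKEN_BODY_SEEN
    return 1;
#else
    return 0;
#endif
}
```

With the #ifdef form, every declaration inside the header silently disappears, which is why such a bug tends to surface as missing-declaration errors or warnings elsewhere.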
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

      (cherry picked from commit 20b68535)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86, espfix: Move espfix definitions into a separate header file · 90bfa721
      H. Peter Anvin authored
      Sparse warns that the percpu variables aren't declared before they are
      defined.  Rather than hacking around it, move espfix definitions into
      a proper header file.
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

      (cherry picked from commit e1fe9ed8)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack · 7f76a341
      H. Peter Anvin authored
      The IRET instruction, when returning to a 16-bit segment, only
      restores the bottom 16 bits of the user space stack pointer.  This
      causes some 16-bit software to break, but it also leaks kernel state
      to user space.  We have a software workaround for that ("espfix") for
      the 32-bit kernel, but it relies on a nonzero stack segment base which
      is not available in 64-bit mode.
      
      In checkin:
      
          b3b42ac2 x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
      
      we "solved" this by forbidding 16-bit segments on 64-bit kernels, with
      the logic that 16-bit support is crippled on 64-bit kernels anyway (no
      V86 support), but it turns out that people are doing stuff like
      running old Win16 binaries under Wine and expect it to work.
      
      This works around this by creating percpu "ministacks", each of which
      is mapped 2^16 times 64K apart.  When we detect that the return SS is
      on the LDT, we copy the IRET frame to the ministack and use the
      relevant alias to return to userspace.  The ministacks are mapped
      readonly, so if IRET faults we promote #GP to #DF which is an IST
      vector and thus has its own stack; we then do the fixup in the #DF
      handler.
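
The leak and the aliasing trick can be modelled with a little arithmetic. This is a simplified sketch that looks only at the low 32 bits of the stack pointer; `esp_after_iret16()` and `ministack_alias()` are hypothetical helpers, and the way the alias is selected here is an assumption drawn from the "mapped 2^16 times 64K apart" description, not the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of IRET returning to a 16-bit stack segment: only the low
 * 16 bits of the stack pointer are restored, so bits 31:16 keep
 * whatever value they had just before the IRET.
 */
static uint32_t esp_after_iret16(uint32_t esp_before_iret, uint32_t user_esp)
{
    return (esp_before_iret & 0xffff0000u) | (user_esp & 0x0000ffffu);
}

/*
 * Hypothetical alias selection: since the ministack is mapped once
 * per 64K step, pick the copy whose bits 31:16 equal those of the
 * user stack pointer, so the bits left behind are the user's own.
 */
static uint32_t ministack_alias(uint32_t ministack_offset, uint32_t user_esp)
{
    return (user_esp & 0xffff0000u) | (ministack_offset & 0x0000ffffu);
}
```

Returning from an ordinary kernel stack address leaves kernel bits in the upper half of the user-visible value, while returning from the matching alias leaves exactly the value user space expected.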
      
      (Making #GP an IST exception would make the msr_safe functions unsafe
      in NMI/MC context, and quite possibly have other effects.)
      
      Special thanks to:
      
      - Andy Lutomirski, for the suggestion of using very small stack slots
        and copy (as opposed to map) the IRET frame there, and for the
        suggestion to mark them readonly and let the fault promote to #DF.
      - Konrad Wilk for paravirt fixup and testing.
      - Borislav Petkov for testing help and useful comments.
      Reported-by: Brian Gerst <brgerst@gmail.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Lutomirski <amluto@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Dirk Hohndel <dirk@hohndel.org>
      Cc: Arjan van de Ven <arjan.van.de.ven@intel.com>
      Cc: comex <comexk@gmail.com>
      Cc: Alexander van Heukelum <heukelum@fastmail.fm>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: <stable@vger.kernel.org> # consider after upstream merge
      
      (cherry picked from commit 3891a04a)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • iio: buffer: Fix demux table creation · 86e6d017
      Lars-Peter Clausen authored
      When creating the demux table we need to iterate over the selected scan mask for
      the buffer to get the samples which should be copied to destination buffer.
      Right now the code uses the mask which contains all active channels, which means
      the demux table contains entries which cause it to copy all the samples from
      the source to the destination buffer one by one without doing any demuxing.
      Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: Jonathan Cameron <jic23@kernel.org>
      Cc: Stable@vger.kernel.org

      (cherry picked from commit 61bd55ce)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • rapidio/tsi721_dma: fix failure to obtain transaction descriptor · 7760818d
      Alexandre Bounine authored
      This is a bug fix for the situation when function tsi721_desc_get() fails
      to obtain a free transaction descriptor.
      
      The bug usually results in a memory access crash dump when data transfer
      scatter-gather list has more entries than size of hardware buffer
      descriptors ring.  This fix ensures that error is properly returned to a
      caller instead of an invalid entry.
      
      This patch is applicable to kernel versions starting from v3.5.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
      Cc: Stef van Os <stef.van.os@prodrive-technologies.com>
      Cc: Vinod Koul <vinod.koul@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: <stable@vger.kernel.org>	[3.5+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      (cherry picked from commit 0193ed82)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • cfg80211: fix mic_failure tracing · e17b018d
      Eliad Peller authored
      tsc can be NULL (mac80211 currently always passes NULL),
      resulting in a NULL dereference. Check before copying it.
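
The guard can be sketched as below. `struct mic_failure_trace`, `trace_fill_tsc()`, the `has_tsc` flag and the length of 6 bytes are all hypothetical choices for this illustration, and zeroing the field when no counter is supplied is an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <string.h>

#define TSC_LEN 6 /* assumed length of the sequence counter in this sketch */

/* Hypothetical trace record holding a copy of the sequence counter. */
struct mic_failure_trace {
    unsigned char tsc[TSC_LEN];
    int has_tsc;
};

/* Copy the counter only when the caller actually supplied one. */
static void trace_fill_tsc(struct mic_failure_trace *entry,
                           const unsigned char *tsc)
{
    assert(entry != NULL);

    memset(entry->tsc, 0, TSC_LEN); /* assumed default when tsc is absent */
    entry->has_tsc = (tsc != NULL);
    if (tsc)
        memcpy(entry->tsc, tsc, TSC_LEN);
}
```

Calling the helper with a NULL pointer leaves a zeroed record instead of dereferencing the pointer.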
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
      Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>

      (cherry picked from commit 8c26d458)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86/efi: Include a .bss section within the PE/COFF headers · cb1f6235
      Michael Brown authored
      The PE/COFF headers currently describe only the initialised-data
      portions of the image, and result in no space being allocated for the
      uninitialised-data portions.  Consequently, the EFI boot stub will end
      up overwriting unexpected areas of memory, with unpredictable results.
      
      Fix by including a .bss section in the PE/COFF headers (functionally
      equivalent to the init_size field in the bzImage header).
      Signed-off-by: Michael Brown <mbrown@fensystems.co.uk>
      Cc: Thomas Bächler <thomas@archlinux.org>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>

      (cherry picked from commit c7fb93ec)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • parisc: Remove SA_RESTORER define · 1ef56f20
      John David Anglin authored
      The sa_restorer field in struct sigaction is obsolete and no longer in
      the parisc implementation.  However, the core code assumes the field is
      present if SA_RESTORER is defined. So, the define needs to be removed.
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Helge Deller <deller@gmx.de>

      (cherry picked from commit 20dbea49)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • Input: fix defuzzing logic · 0880d08e
      Dmitry Torokhov authored
      We attempt to remove noise from coordinates reported by devices in
      input_handle_abs_event(). Unfortunately, unless we were dropping the
      event altogether, we were ignoring the adjusted value and passing
      on the original value instead.
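
A minimal sketch of the intended behaviour follows. The filter thresholds are modelled on a typical fuzz filter and should be read as an assumption, not as a copy of the kernel function; `defuzz()`, `struct abs_axis` and `handle_abs()` are hypothetical names. What matters is that the caller stores and forwards the adjusted value rather than the raw one.

```c
#include <assert.h>

/*
 * Smooth a new reading against the previous one. Readings very close
 * to the old value are treated as noise, and readings somewhat
 * further away are blended with the old value.
 */
static int defuzz(int value, int old_value, int fuzz)
{
    if (fuzz) {
        if (value > old_value - fuzz / 2 && value < old_value + fuzz / 2)
            return old_value;
        if (value > old_value - fuzz && value < old_value + fuzz)
            return (old_value * 3 + value) / 4;
        if (value > old_value - fuzz * 2 && value < old_value + fuzz * 2)
            return (old_value + value) / 2;
    }
    return value;
}

/* Hypothetical per-axis state. */
struct abs_axis {
    int value; /* last value passed on */
    int fuzz;  /* noise threshold */
};

/*
 * Returns 1 if the event should be passed on, 0 if it is dropped.
 * On return, *value holds the adjusted reading, so the caller
 * forwards the filtered value and not the raw one.
 */
static int handle_abs(struct abs_axis *axis, int *value)
{
    assert(axis != 0 && value != 0);

    *value = defuzz(*value, axis->value, axis->fuzz);
    if (*value == axis->value)
        return 0;

    axis->value = *value;
    return 1;
}
```

With a previous value of 100 and a fuzz of 8, a raw reading of 106 is forwarded as 101. The bug described above corresponds to forwarding 106 instead.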
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Andrew de los Reyes <adlr@chromium.org>
      Reviewed-by: Benson Leung <bleung@chromium.org>
      Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
      Reviewed-by: Henrik Rydberg <rydberg@euromail.se>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

      (cherry picked from commit 50c5d36d)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • slab_common: fix the check for duplicate slab names · cde50abc
      Mikulas Patocka authored
      The patch 3e374919 is supposed to fix the
      problem where kmem_cache_create incorrectly reports duplicate cache name
      and fails. The problem is described in the header of that patch.
      
      However, the patch doesn't really fix the problem because of these
      reasons:
      
      * the logic to test for debugging is reversed. It was intended to perform
        the check only if slub debugging is enabled (which implies that caches
        with the same parameters are not merged). Therefore, there should be
        #if !defined(CONFIG_SLUB) || defined(CONFIG_SLUB_DEBUG_ON)
        The current code has the condition reversed and performs the test if
        debugging is disabled.
      
      * slub debugging may be enabled or disabled based on kernel command line,
        CONFIG_SLUB_DEBUG_ON is just the default settings. Therefore the test
        based on definition of CONFIG_SLUB_DEBUG_ON is unreliable.
      
      This patch fixes the problem by removing the test
      "!defined(CONFIG_SLUB_DEBUG_ON)". Therefore, duplicate names are never
      checked if the SLUB allocator is used.
      
      Note to stable kernel maintainers: when backporting this patch, please
      also backport the patch 3e374919.
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org	# 3.6+
      Signed-off-by: Pekka Enberg <penberg@kernel.org>

      (cherry picked from commit 69461747)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • slab_common: Do not check for duplicate slab names · 890c7afd
      Christoph Lameter authored
      SLUB can alias multiple kmem_cache_create() requests to one slab cache to save
      memory and increase the cache hotness. As a result, the name of the slab can be
      stale. Only check the name for duplicates if we are in debug mode where we do
      not merge multiple caches.
      
      This fixes the following problem reported by Jonathan Brassow:
      
        The problem with kmem_cache* is this:
      
        *) Assume CONFIG_SLUB is set
        1) kmem_cache_create(name="foo-a")
        - creates new kmem_cache structure
        2) kmem_cache_create(name="foo-b")
        - If identical cache characteristics, it will be merged with the previously
          created cache associated with "foo-a".  The cache's refcount will be
          incremented and an alias will be created via sysfs_slab_alias().
        3) kmem_cache_destroy(<ptr>)
        - Attempting to destroy cache associated with "foo-a", but instead the
          refcount is simply decremented.  I don't even think the sysfs aliases are
          ever removed...
        4) kmem_cache_create(name="foo-a")
        - This FAILS because kmem_cache_sanity_check collides with the existing
          name ("foo-a") associated with the non-removed cache.
      
        This is a problem for RAID (specifically dm-raid) because the name used
        for the kmem_cache_create is ("raid%d-%p", level, mddev).  If the cache
        persists for long enough, the memory address of an old mddev will be
        reused for a new mddev - causing an identical formulation of the cache
        name.  Even though kmem_cache_destroy had long ago been used to delete
        the old cache, the merging of caches has caused the name and cache of that
        old instance to be preserved and causes a collision (and thus failure) in
        kmem_cache_create().  I see this regularly in my testing.
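
The four-step sequence in the report can be reproduced with a toy model. Everything here is hypothetical: `struct cache`, `cache_create()`, `cache_destroy()` and the `check_names` flag are illustration-only stand-ins, and "identical characteristics" is reduced to an equal object size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHES 8 /* arbitrary table size for the sketch */

struct cache {
    const char *name; /* name given by the first creator */
    size_t obj_size;  /* stand-in for "cache characteristics" */
    int refcount;     /* 0 means the slot is free */
};

static struct cache caches[MAX_CACHES];

/* Is this name attached to any cache that is still alive? */
static int name_in_use(const char *name)
{
    for (int i = 0; i < MAX_CACHES; i++)
        if (caches[i].refcount > 0 && strcmp(caches[i].name, name) == 0)
            return 1;
    return 0;
}

/*
 * Create a cache, merging with an existing one of the same object
 * size. When check_names is nonzero, a duplicate name is rejected
 * before merging is considered (this models the sanity check).
 */
static struct cache *cache_create(const char *name, size_t obj_size,
                                  int check_names)
{
    if (check_names && name_in_use(name))
        return NULL;

    /* Merge: bump the refcount, keep the original creator's name. */
    for (int i = 0; i < MAX_CACHES; i++)
        if (caches[i].refcount > 0 && caches[i].obj_size == obj_size) {
            caches[i].refcount++;
            return &caches[i];
        }

    /* No mergeable cache: take a free slot. */
    for (int i = 0; i < MAX_CACHES; i++)
        if (caches[i].refcount == 0) {
            caches[i].name = name;
            caches[i].obj_size = obj_size;
            caches[i].refcount = 1;
            return &caches[i];
        }

    return NULL;
}

/* Destroying a merged cache only drops a reference. */
static void cache_destroy(struct cache *c)
{
    assert(c != NULL && c->refcount > 0);
    c->refcount--;
}
```

In this model, creating "foo-a", creating "foo-b" with the same size, destroying "foo-a" and then creating "foo-a" again fails when the name check is enabled, because the merged cache still carries the stale name. With the check disabled the last step succeeds.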
      Reported-by: Jonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Pekka Enberg <penberg@kernel.org>

      (cherry picked from commit 3e374919)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>