1. 19 May, 2016 1 commit
    • Neil Horman's avatar
      netem: Segment GSO packets on enqueue · 521a3b41
      Neil Horman authored
      [ Upstream commit 6071bd1a ]
      
      This was recently reported to me, and reproduced on the latest net kernel,
      when attempting to run netperf from a host that had a netem qdisc attached
      to the egress interface:
      
      [  788.073771] ---------------------[ cut here ]---------------------------
      [  788.096716] WARNING: at net/core/dev.c:2253 skb_warn_bad_offload+0xcd/0xda()
      [  788.129521] bnx2: caps=(0x00000001801949b3, 0x0000000000000000) len=2962
      data_len=0 gso_size=1448 gso_type=1 ip_summed=3
      [  788.182150] Modules linked in: sch_netem kvm_amd kvm crc32_pclmul ipmi_ssif
      ghash_clmulni_intel sp5100_tco amd64_edac_mod aesni_intel lrw gf128mul
      glue_helper ablk_helper edac_mce_amd cryptd pcspkr sg edac_core hpilo ipmi_si
      i2c_piix4 k10temp fam15h_power hpwdt ipmi_msghandler shpchp acpi_power_meter
      pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c
      sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt
      i2c_algo_bit drm_kms_helper ahci ata_generic pata_acpi ttm libahci
      crct10dif_pclmul pata_atiixp tg3 libata crct10dif_common drm crc32c_intel ptp
      serio_raw bnx2 r8169 hpsa pps_core i2c_core mii dm_mirror dm_region_hash dm_log
      dm_mod
      [  788.465294] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G        W
      ------------   3.10.0-327.el7.x86_64 #1
      [  788.511521] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 12/17/2012
      [  788.542260]  ffff880437c036b8 f7afc56532a53db9 ffff880437c03670
      ffffffff816351f1
      [  788.576332]  ffff880437c036a8 ffffffff8107b200 ffff880633e74200
      ffff880231674000
      [  788.611943]  0000000000000001 0000000000000003 0000000000000000
      ffff880437c03710
      [  788.647241] Call Trace:
      [  788.658817]  <IRQ>  [<ffffffff816351f1>] dump_stack+0x19/0x1b
      [  788.686193]  [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
      [  788.713803]  [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
      [  788.741314]  [<ffffffff812f92f3>] ? ___ratelimit+0x93/0x100
      [  788.767018]  [<ffffffff81637f49>] skb_warn_bad_offload+0xcd/0xda
      [  788.796117]  [<ffffffff8152950c>] skb_checksum_help+0x17c/0x190
      [  788.823392]  [<ffffffffa01463a1>] netem_enqueue+0x741/0x7c0 [sch_netem]
      [  788.854487]  [<ffffffff8152cb58>] dev_queue_xmit+0x2a8/0x570
      [  788.880870]  [<ffffffff8156ae1d>] ip_finish_output+0x53d/0x7d0
      ...
      
      The problem occurs because netem is not prepared to handle GSO packets (as it
      uses skb_checksum_help in its enqueue path, which cannot manipulate these
      frames).
      
      The solution I think is to simply segment the skb in a simmilar fashion to the
      way we do in __dev_queue_xmit (via validate_xmit_skb), with some minor changes.
      When we decide to corrupt an skb, if the frame is GSO, we segment it, corrupt
      the first segment, and enqueue the remaining ones.
      
      tested successfully by myself on the latest net kernel, to which this applies
      
      [js] backport to 3.12: no qdisc_qstats_drop yet, update directly. Also use
           qdisc_tree_decrease_qlen instead of qdisc_tree_reduce_backlog.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      CC: Jamal Hadi Salim <jhs@mojatatu.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netem@lists.linux-foundation.org
      CC: eric.dumazet@gmail.com
      CC: stephen@networkplumber.org
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      521a3b41
  2. 18 May, 2016 10 commits
  3. 16 May, 2016 4 commits
  4. 13 May, 2016 2 commits
    • Konstantin Khlebnikov's avatar
      mm/balloon_compaction: fix deflation when compaction is disabled · 1f649733
      Konstantin Khlebnikov authored
      commit 4d88e6f7 upstream.
      
      If CONFIG_BALLOON_COMPACTION=n balloon_page_insert() does not link pages
      with balloon and doesn't set PagePrivate flag, as a result
      balloon_page_dequeue() cannot get any pages because it thinks that all
      of them are isolated.  Without balloon compaction nobody can isolate
      ballooned pages.  It's safe to remove this check.
      
      Fixes: d6d86c0a ("mm/balloon_compaction: redesign ballooned pages management").
      Signed-off-by: default avatarKonstantin Khlebnikov <k.khlebnikov@samsung.com>
      Reported-by: default avatarMatt Mullins <mmullins@mmlx.us>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Gavin Guo <gavin.guo@canonical.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1f649733
    • Konstantin Khlebnikov's avatar
      mm/balloon_compaction: redesign ballooned pages management · 33904d89
      Konstantin Khlebnikov authored
      commit d6d86c0a upstream.
      
      Sasha Levin reported KASAN splash inside isolate_migratepages_range().
      Problem is in the function __is_movable_balloon_page() which tests
      AS_BALLOON_MAP in page->mapping->flags.  This function has no protection
      against anonymous pages.  As result it tried to check address space flags
      inside struct anon_vma.
      
      Further investigation shows more problems in current implementation:
      
      * Special branch in __unmap_and_move() never works:
        balloon_page_movable() checks page flags and page_count.  In
        __unmap_and_move() page is locked, reference counter is elevated, thus
        balloon_page_movable() always fails.  As a result execution goes to the
        normal migration path.  virtballoon_migratepage() returns
        MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
        move_to_new_page() thinks this is an error code and assigns
        newpage->mapping to NULL.  Newly migrated page lose connectivity with
        balloon an all ability for further migration.
      
      * lru_lock erroneously required in isolate_migratepages_range() for
        isolation ballooned page.  This function releases lru_lock periodically,
        this makes migration mostly impossible for some pages.
      
      * balloon_page_dequeue have a tight race with balloon_page_isolate:
        balloon_page_isolate could be executed in parallel with dequeue between
        picking page from list and locking page_lock.  Race is rare because they
        use trylock_page() for locking.
      
      This patch fixes all of them.
      
      Instead of fake mapping with special flag this patch uses special state of
      page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256.  Buddy allocator uses
      PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose.  Storing mark
      directly in struct page makes everything safer and easier.
      
      PagePrivate is used to mark pages present in page list (i.e.  not
      isolated, like PageLRU for normal pages).  It replaces special rules for
      reference counter and makes balloon migration similar to migration of
      normal pages.  This flag is protected by page_lock together with link to
      the balloon device.
      
      [js] backport to 3.12. MIGRATEPAGE_BALLOON_SUCCESS had to be removed
           from one more place. VM_BUG_ON_PAGE does not exist in 3.12 yet,
           use plain VM_BUG_ON.
      Signed-off-by: default avatarKonstantin Khlebnikov <k.khlebnikov@samsung.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Gavin Guo <gavin.guo@canonical.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      33904d89
  5. 11 May, 2016 20 commits
  6. 03 May, 2016 3 commits
    • John Stultz's avatar
      cpuset: Fix potential deadlock w/ set_mems_allowed · d9aa9f58
      John Stultz authored
      commit db751fe3 upstream.
      
      After adding lockdep support to seqlock/seqcount structures,
      I started seeing the following warning:
      
      [    1.070907] ======================================================
      [    1.072015] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
      [    1.073181] 3.11.0+ #67 Not tainted
      [    1.073801] ------------------------------------------------------
      [    1.074882] kworker/u4:2/708 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
      [    1.076088]  (&p->mems_allowed_seq){+.+...}, at: [<ffffffff81187d7f>] new_slab+0x5f/0x280
      [    1.077572]
      [    1.077572] and this task is already holding:
      [    1.078593]  (&(&q->__queue_lock)->rlock){..-...}, at: [<ffffffff81339f03>] blk_execute_rq_nowait+0x53/0xf0
      [    1.080042] which would create a new lock dependency:
      [    1.080042]  (&(&q->__queue_lock)->rlock){..-...} -> (&p->mems_allowed_seq){+.+...}
      [    1.080042]
      [    1.080042] but this new dependency connects a SOFTIRQ-irq-safe lock:
      [    1.080042]  (&(&q->__queue_lock)->rlock){..-...}
      [    1.080042] ... which became SOFTIRQ-irq-safe at:
      [    1.080042]   [<ffffffff810ec179>] __lock_acquire+0x5b9/0x1db0
      [    1.080042]   [<ffffffff810edfe5>] lock_acquire+0x95/0x130
      [    1.080042]   [<ffffffff818968a1>] _raw_spin_lock+0x41/0x80
      [    1.080042]   [<ffffffff81560c9e>] scsi_device_unbusy+0x7e/0xd0
      [    1.080042]   [<ffffffff8155a612>] scsi_finish_command+0x32/0xf0
      [    1.080042]   [<ffffffff81560e91>] scsi_softirq_done+0xa1/0x130
      [    1.080042]   [<ffffffff8133b0f3>] blk_done_softirq+0x73/0x90
      [    1.080042]   [<ffffffff81095dc0>] __do_softirq+0x110/0x2f0
      [    1.080042]   [<ffffffff81095fcd>] run_ksoftirqd+0x2d/0x60
      [    1.080042]   [<ffffffff810bc506>] smpboot_thread_fn+0x156/0x1e0
      [    1.080042]   [<ffffffff810b3916>] kthread+0xd6/0xe0
      [    1.080042]   [<ffffffff818980ac>] ret_from_fork+0x7c/0xb0
      [    1.080042]
      [    1.080042] to a SOFTIRQ-irq-unsafe lock:
      [    1.080042]  (&p->mems_allowed_seq){+.+...}
      [    1.080042] ... which became SOFTIRQ-irq-unsafe at:
      [    1.080042] ...  [<ffffffff810ec1d3>] __lock_acquire+0x613/0x1db0
      [    1.080042]   [<ffffffff810edfe5>] lock_acquire+0x95/0x130
      [    1.080042]   [<ffffffff810b3df2>] kthreadd+0x82/0x180
      [    1.080042]   [<ffffffff818980ac>] ret_from_fork+0x7c/0xb0
      [    1.080042]
      [    1.080042] other info that might help us debug this:
      [    1.080042]
      [    1.080042]  Possible interrupt unsafe locking scenario:
      [    1.080042]
      [    1.080042]        CPU0                    CPU1
      [    1.080042]        ----                    ----
      [    1.080042]   lock(&p->mems_allowed_seq);
      [    1.080042]                                local_irq_disable();
      [    1.080042]                                lock(&(&q->__queue_lock)->rlock);
      [    1.080042]                                lock(&p->mems_allowed_seq);
      [    1.080042]   <Interrupt>
      [    1.080042]     lock(&(&q->__queue_lock)->rlock);
      [    1.080042]
      [    1.080042]  *** DEADLOCK ***
      
      The issue stems from the kthreadd() function calling set_mems_allowed
      with irqs enabled. While its possibly unlikely for the actual deadlock
      to trigger, a fix is fairly simple: disable irqs before taking the
      mems_allowed_seq lock.
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Acked-by: default avatarLi Zefan <lizefan@huawei.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Link: http://lkml.kernel.org/r/1381186321-4906-4-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d9aa9f58
    • Ewan D. Milne's avatar
      scsi: Avoid crashing if device uses DIX but adapter does not support it · afa099d3
      Ewan D. Milne authored
      commit 91724c20 upstream.
      
      This can happen if a multipathed device uses DIX and another path is
      added via an adapter that does not support it.  Multipath should not
      allow this path to be added, but we should not depend upon that to avoid
      crashing.
      Signed-off-by: default avatarEwan D. Milne <emilne@redhat.com>
      Reviewed-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      afa099d3
    • Adrian Hunter's avatar
      mmc: sdhci: Allow for irq being shared · d9a22915
      Adrian Hunter authored
      commit 655bca76 upstream.
      
      If the SDHCI irq is shared with another device then the interrupt
      handler can get called while SDHCI is runtime suspended.  That is
      harmless but the warning message is not useful so remove it.  Also
      returning IRQ_NONE is more appropriate.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: default avatarChris Ball <chris@printf.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d9a22915