1. 19 May, 2016 5 commits
    • Nikolay Aleksandrov's avatar
      net: bridge: fix old ioctl unlocked net device walk · cf5ae3f3
      Nikolay Aleksandrov authored
      [ Upstream commit 31ca0458 ]
      
      get_bridge_ifindices() is used from the old "deviceless" bridge ioctl
      calls which aren't called with rtnl held. The comment above says that it is
      called with rtnl but that is not really the case.
      Here's a sample output from a test ASSERT_RTNL() which I put in
      get_bridge_ifindices and executed "brctl show":
      [  957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30)
      [  957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G        W  O
      4.6.0-rc4+ #157
      [  957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      BIOS 1.8.1-20150318_183358- 04/01/2014
      [  957.423009]  0000000000000000 ffff880058adfdf0 ffffffff8138dec5
      0000000000000400
      [  957.423009]  ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32
      0000000000000001
      [  957.423009]  00007ffec1a444b0 0000000000000400 ffff880053c19130
      0000000000008940
      [  957.423009] Call Trace:
      [  957.423009]  [<ffffffff8138dec5>] dump_stack+0x85/0xc0
      [  957.423009]  [<ffffffffa05ead32>]
      br_ioctl_deviceless_stub+0x212/0x2e0 [bridge]
      [  957.423009]  [<ffffffff81515beb>] sock_ioctl+0x22b/0x290
      [  957.423009]  [<ffffffff8126ba75>] do_vfs_ioctl+0x95/0x700
      [  957.423009]  [<ffffffff8126c159>] SyS_ioctl+0x79/0x90
      [  957.423009]  [<ffffffff8163a4c0>] entry_SYSCALL_64_fastpath+0x23/0xc1
      
      Since it only reads bridge ifindices, we can use rcu to safely walk the net
      device list. Also remove the wrong rtnl comment above.
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      cf5ae3f3
    • Ian Campbell's avatar
      VSOCK: do not disconnect socket when peer has shutdown SEND only · fa9ba455
      Ian Campbell authored
      [ Upstream commit dedc58e0 ]
      
      The peer may be expecting a reply having sent a request and then done a
      shutdown(SHUT_WR), so tearing down the whole socket at this point seems
      wrong and breaks for me with a client which does a SHUT_WR.
      
      Looking at other socket family's stream_recvmsg callbacks doing a shutdown
      here does not seem to be the norm and removing it does not seem to have
      had any adverse effects that I can see.
      
      I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact
      on the vmci transport.
      Signed-off-by: default avatarIan Campbell <ian.campbell@docker.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
      Cc: Andy King <acking@vmware.com>
      Cc: Dmitry Torokhov <dtor@vmware.com>
      Cc: Jorgen Hansen <jhansen@vmware.com>
      Cc: Adit Ranadive <aditr@vmware.com>
      Cc: netdev@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fa9ba455
    • Kangjie Lu's avatar
      net: fix infoleak in rtnetlink · 3248734d
      Kangjie Lu authored
      [ Upstream commit 5f8e4474 ]
      
      The stack object “map” has a total size of 32 bytes. Its last 4
      bytes are padding generated by compiler. These padding bytes are
      not initialized and sent out via “nla_put”.
      Signed-off-by: default avatarKangjie Lu <kjlu@gatech.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3248734d
    • Kangjie Lu's avatar
      net: fix infoleak in llc · 734b9658
      Kangjie Lu authored
      [ Upstream commit b8670c09 ]
      
      The stack object “info” has a total size of 12 bytes. Its last byte
      is padding which is not initialized and leaked via “put_cmsg”.
      Signed-off-by: default avatarKangjie Lu <kjlu@gatech.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      734b9658
    • Neil Horman's avatar
      netem: Segment GSO packets on enqueue · 521a3b41
      Neil Horman authored
      [ Upstream commit 6071bd1a ]
      
      This was recently reported to me, and reproduced on the latest net kernel,
      when attempting to run netperf from a host that had a netem qdisc attached
      to the egress interface:
      
      [  788.073771] ---------------------[ cut here ]---------------------------
      [  788.096716] WARNING: at net/core/dev.c:2253 skb_warn_bad_offload+0xcd/0xda()
      [  788.129521] bnx2: caps=(0x00000001801949b3, 0x0000000000000000) len=2962
      data_len=0 gso_size=1448 gso_type=1 ip_summed=3
      [  788.182150] Modules linked in: sch_netem kvm_amd kvm crc32_pclmul ipmi_ssif
      ghash_clmulni_intel sp5100_tco amd64_edac_mod aesni_intel lrw gf128mul
      glue_helper ablk_helper edac_mce_amd cryptd pcspkr sg edac_core hpilo ipmi_si
      i2c_piix4 k10temp fam15h_power hpwdt ipmi_msghandler shpchp acpi_power_meter
      pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c
      sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt
      i2c_algo_bit drm_kms_helper ahci ata_generic pata_acpi ttm libahci
      crct10dif_pclmul pata_atiixp tg3 libata crct10dif_common drm crc32c_intel ptp
      serio_raw bnx2 r8169 hpsa pps_core i2c_core mii dm_mirror dm_region_hash dm_log
      dm_mod
      [  788.465294] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G        W
      ------------   3.10.0-327.el7.x86_64 #1
      [  788.511521] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 12/17/2012
      [  788.542260]  ffff880437c036b8 f7afc56532a53db9 ffff880437c03670
      ffffffff816351f1
      [  788.576332]  ffff880437c036a8 ffffffff8107b200 ffff880633e74200
      ffff880231674000
      [  788.611943]  0000000000000001 0000000000000003 0000000000000000
      ffff880437c03710
      [  788.647241] Call Trace:
      [  788.658817]  <IRQ>  [<ffffffff816351f1>] dump_stack+0x19/0x1b
      [  788.686193]  [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
      [  788.713803]  [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
      [  788.741314]  [<ffffffff812f92f3>] ? ___ratelimit+0x93/0x100
      [  788.767018]  [<ffffffff81637f49>] skb_warn_bad_offload+0xcd/0xda
      [  788.796117]  [<ffffffff8152950c>] skb_checksum_help+0x17c/0x190
      [  788.823392]  [<ffffffffa01463a1>] netem_enqueue+0x741/0x7c0 [sch_netem]
      [  788.854487]  [<ffffffff8152cb58>] dev_queue_xmit+0x2a8/0x570
      [  788.880870]  [<ffffffff8156ae1d>] ip_finish_output+0x53d/0x7d0
      ...
      
      The problem occurs because netem is not prepared to handle GSO packets (as it
      uses skb_checksum_help in its enqueue path, which cannot manipulate these
      frames).
      
      The solution I think is to simply segment the skb in a simmilar fashion to the
      way we do in __dev_queue_xmit (via validate_xmit_skb), with some minor changes.
      When we decide to corrupt an skb, if the frame is GSO, we segment it, corrupt
      the first segment, and enqueue the remaining ones.
      
      tested successfully by myself on the latest net kernel, to which this applies
      
      [js] backport to 3.12: no qdisc_qstats_drop yet, update directly. Also use
           qdisc_tree_decrease_qlen instead of qdisc_tree_reduce_backlog.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      CC: Jamal Hadi Salim <jhs@mojatatu.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netem@lists.linux-foundation.org
      CC: eric.dumazet@gmail.com
      CC: stephen@networkplumber.org
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      521a3b41
  2. 18 May, 2016 10 commits
  3. 16 May, 2016 4 commits
  4. 13 May, 2016 2 commits
    • Konstantin Khlebnikov's avatar
      mm/balloon_compaction: fix deflation when compaction is disabled · 1f649733
      Konstantin Khlebnikov authored
      commit 4d88e6f7 upstream.
      
      If CONFIG_BALLOON_COMPACTION=n balloon_page_insert() does not link pages
      with balloon and doesn't set PagePrivate flag, as a result
      balloon_page_dequeue() cannot get any pages because it thinks that all
      of them are isolated.  Without balloon compaction nobody can isolate
      ballooned pages.  It's safe to remove this check.
      
      Fixes: d6d86c0a ("mm/balloon_compaction: redesign ballooned pages management").
      Signed-off-by: default avatarKonstantin Khlebnikov <k.khlebnikov@samsung.com>
      Reported-by: default avatarMatt Mullins <mmullins@mmlx.us>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Gavin Guo <gavin.guo@canonical.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1f649733
    • Konstantin Khlebnikov's avatar
      mm/balloon_compaction: redesign ballooned pages management · 33904d89
      Konstantin Khlebnikov authored
      commit d6d86c0a upstream.
      
      Sasha Levin reported KASAN splash inside isolate_migratepages_range().
      Problem is in the function __is_movable_balloon_page() which tests
      AS_BALLOON_MAP in page->mapping->flags.  This function has no protection
      against anonymous pages.  As result it tried to check address space flags
      inside struct anon_vma.
      
      Further investigation shows more problems in current implementation:
      
      * Special branch in __unmap_and_move() never works:
        balloon_page_movable() checks page flags and page_count.  In
        __unmap_and_move() page is locked, reference counter is elevated, thus
        balloon_page_movable() always fails.  As a result execution goes to the
        normal migration path.  virtballoon_migratepage() returns
        MIGRATEPAGE_BALLOON_SUCCESS instead of MIGRATEPAGE_SUCCESS,
        move_to_new_page() thinks this is an error code and assigns
        newpage->mapping to NULL.  Newly migrated page lose connectivity with
        balloon an all ability for further migration.
      
      * lru_lock erroneously required in isolate_migratepages_range() for
        isolation ballooned page.  This function releases lru_lock periodically,
        this makes migration mostly impossible for some pages.
      
      * balloon_page_dequeue have a tight race with balloon_page_isolate:
        balloon_page_isolate could be executed in parallel with dequeue between
        picking page from list and locking page_lock.  Race is rare because they
        use trylock_page() for locking.
      
      This patch fixes all of them.
      
      Instead of fake mapping with special flag this patch uses special state of
      page->_mapcount: PAGE_BALLOON_MAPCOUNT_VALUE = -256.  Buddy allocator uses
      PAGE_BUDDY_MAPCOUNT_VALUE = -128 for similar purpose.  Storing mark
      directly in struct page makes everything safer and easier.
      
      PagePrivate is used to mark pages present in page list (i.e.  not
      isolated, like PageLRU for normal pages).  It replaces special rules for
      reference counter and makes balloon migration similar to migration of
      normal pages.  This flag is protected by page_lock together with link to
      the balloon device.
      
      [js] backport to 3.12. MIGRATEPAGE_BALLOON_SUCCESS had to be removed
           from one more place. VM_BUG_ON_PAGE does not exist in 3.12 yet,
           use plain VM_BUG_ON.
      Signed-off-by: default avatarKonstantin Khlebnikov <k.khlebnikov@samsung.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Link: http://lkml.kernel.org/p/53E6CEAA.9020105@oracle.com
      Cc: Rafael Aquini <aquini@redhat.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Gavin Guo <gavin.guo@canonical.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      33904d89
  5. 11 May, 2016 19 commits