1. 22 Mar, 2016 31 commits
  2. 15 Mar, 2016 2 commits
  3. 14 Mar, 2016 1 commit
  4. 10 Mar, 2016 6 commits
    • Willy Tarreau's avatar
      pipe: limit the per-user amount of pages allocated in pipes · ac0d50a3
      Willy Tarreau authored
      commit 759c0114 upstream.
      
      On no-so-small systems, it is possible for a single process to cause an
      OOM condition by filling large pipes with data that are never read. A
      typical process filling 4000 pipes with 1 MB of data will use 4 GB of
      memory. On small systems it may be tricky to set the pipe max size to
      prevent this from happening.
      
      This patch makes it possible to enforce a per-user soft limit above
      which new pipes will be limited to a single page, effectively limiting
      them to 4 kB each, as well as a hard limit above which no new pipes may
      be created for this user. This has the effect of protecting the system
      against memory abuse without hurting other users, and still allowing
      pipes to work correctly though with less data at once.
      
      The limit are controlled by two new sysctls : pipe-user-pages-soft, and
      pipe-user-pages-hard. Both may be disabled by setting them to zero. The
      default soft limit allows the default number of FDs per process (1024)
      to create pipes of the default size (64kB), thus reaching a limit of 64MB
      before starting to create only smaller pipes. With 256 processes limited
      to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
      1084 MB of memory allocated for a user. The hard limit is disabled by
      default to avoid breaking existing applications that make intensive use
      of pipes (eg: for splicing).
      
      Reported-by: socketpair@gmail.com
      Reported-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Mitigates: CVE-2013-4312 (Linux 2.0+)
      Suggested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ac0d50a3
    • Rainer Weikusat's avatar
      af_unix: Don't set err in unix_stream_read_generic unless there was an error · d4425f1f
      Rainer Weikusat authored
      commit 1b92ee3d upstream.
      
      The present unix_stream_read_generic contains various code sequences of
      the form
      
      err = -EDISASTER;
      if (<test>)
      	goto out;
      
      This has the unfortunate side effect of possibly causing the error code
      to bleed through to the final
      
      out:
      	return copied ? : err;
      
      and then to be wrongly returned if no data was copied because the caller
      didn't supply a data buffer, as demonstrated by the program available at
      
      http://pad.lv/1540731
      
      Change it such that err is only set if an error condition was detected.
      
      Fixes: 3822b5c2 ("af_unix: Revert 'lock_interruptible' in stream receive code")
      Reported-by: default avatarJoseph Salisbury <joseph.salisbury@canonical.com>
      Signed-off-by: default avatarRainer Weikusat <rweikusat@mobileactivedefense.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [ luis: backported to 3.16:
        - modify unix_stream_recvmsg() instead of unix_stream_read_generic()
        - adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      d4425f1f
    • Eugenia Emantayev's avatar
      net/mlx4_en: Choose time-stamping shift value according to HW frequency · caf7698e
      Eugenia Emantayev authored
      commit 31c128b6 upstream.
      
      Previously, the shift value used for time-stamping was constant and didn't
      depend on the HW chip frequency. Change that to take the frequency into account
      and calculate the maximal value in cycles per wraparound of ten seconds. This
      time slot was chosen since it gives a good accuracy in time synchronization.
      
      Algorithm for shift value calculation:
       * Round up the maximal value in cycles to nearest power of two
      
       * Calculate maximal multiplier by division of all 64 bits set
         to above result
      
       * Then, invert the function clocksource_khz2mult() to get the shift from
         maximal mult value
      
      Fixes: ec693d47 ('net/mlx4_en: Add HW timestamping (TS) support')
      Signed-off-by: default avatarEugenia Emantayev <eugenia@mellanox.com>
      Reviewed-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      caf7698e
    • Eric Dumazet's avatar
      ipv4: fix memory leaks in ip_cmsg_send() callers · 372067d3
      Eric Dumazet authored
      commit 91948309 upstream.
      
      Dmitry reported memory leaks of IP options allocated in
      ip_cmsg_send() when/if this function returns an error.
      
      Callers are responsible for the freeing.
      
      Many thanks to Dmitry for the report and diagnostic.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      372067d3
    • Jay Vosburgh's avatar
      bonding: Fix ARP monitor validation · 7d07524d
      Jay Vosburgh authored
      commit 21a75f09 upstream.
      
      The current logic in bond_arp_rcv will accept an incoming ARP for
      validation if (a) the receiving slave is either "active" (which includes
      the currently active slave, or the current ARP slave) or, (b) there is a
      currently active slave, and it has received an ARP since it became active.
      For case (b), the receiving slave isn't the currently active slave, and is
      receiving the original broadcast ARP request, not an ARP reply from the
      target.
      
      	This logic can fail if there is no currently active slave.  In
      this situation, the ARP probe logic cycles through all slaves, assigning
      each in turn as the "current_arp_slave" for one arp_interval, then setting
      that one as "active," and sending an ARP probe from that slave.  The
      current logic expects the ARP reply to arrive on the sending
      current_arp_slave, however, due to switch FDB updating delays, the reply
      may be directed to another slave.
      
      	This can arise if the bonding slaves and switch are working, but
      the ARP target is not responding.  When the ARP target recovers, a
      condition may result wherein the ARP target host replies faster than the
      switch can update its forwarding table, causing each ARP reply to be sent
      to the previous current_arp_slave.  This will never pass the logic in
      bond_arp_rcv, as neither of the above conditions (a) or (b) are met.
      
      	Some experimentation on a LAN shows ARP reply round trips in the
      200 usec range, but my available switches never update their FDB in less
      than 4000 usec.
      
      	This patch changes the logic in bond_arp_rcv to additionally
      accept an ARP reply for validation on any slave if there is a current ARP
      slave and it sent an ARP probe during the previous arp_interval.
      
      Fixes: aeea64ac ("bonding: don't trust arp requests unless active slave really works")
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      [ kamal: backported to 3.13: adjusted context ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      7d07524d
    • Neil Horman's avatar
      sctp: Fix port hash table size computation · 87042cdc
      Neil Horman authored
      commit d9749fb5 upstream.
      
      Dmitry Vyukov noted recently that the sctp_port_hashtable had an error in
      its size computation, observing that the current method never guaranteed
      that the hashsize (measured in number of entries) would be a power of two,
      which the input hash function for that table requires.  The root cause of
      the problem is that two values need to be computed (one, the allocation
      order of the storage requries, as passed to __get_free_pages, and two the
      number of entries for the hash table).  Both need to be ^2, but for
      different reasons, and the existing code is simply computing one order
      value, and using it as the basis for both, which is wrong (i.e. it assumes
      that ((1<<order)*PAGE_SIZE)/sizeof(bucket) is still ^2 when its not).
      
      To fix this, we change the logic slightly.  We start by computing a goal
      allocation order (which is limited by the maximum size hash table we want
      to support.  Then we attempt to allocate that size table, decreasing the
      order until a successful allocation is made.  Then, with the resultant
      successful order we compute the number of buckets that hash table supports,
      which we then round down to the nearest power of two, giving us the number
      of entries the table actually supports.
      
      I've tested this locally here, using non-debug and spinlock-debug kernels,
      and the number of entries in the hashtable consistently work out to be
      powers of two in all cases.
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      CC: Dmitry Vyukov <dvyukov@google.com>
      CC: Vladislav Yasevich <vyasevich@gmail.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      87042cdc