1. 19 Nov, 2013 7 commits
    • quota/genetlink: use proper genetlink multicast APIs · 2ecf7536
      Johannes Berg authored
      The quota code is abusing the genetlink API and is using
      its family ID as the multicast group ID, which is invalid
      and may belong to somebody else (and likely will.)
      
      Make the quota code use the correct API, but since this
      is already used as-is by userspace, reserve a family ID
      for this code and also reserve that group ID to not break
      userspace assumptions.
      Acked-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drop_monitor/genetlink: use proper genetlink multicast APIs · e5dcecba
      Johannes Berg authored
      The drop monitor code is abusing the genetlink API and is
      statically using the generic netlink multicast group 1, even
      if that group belongs to somebody else (which it invariably
      will, since it's not reserved.)
      
      Make the drop monitor code use the proper APIs to reserve a
      group ID, but also reserve the group id 1 in generic netlink
      code to preserve the userspace API. Since drop monitor can
      be a module, don't clear the bit for it on unregistration.
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • genetlink: only pass array to genl_register_family_with_ops() · c53ed742
      Johannes Berg authored
      As suggested by David Miller, make genl_register_family_with_ops()
      a macro and pass only the array, evaluating ARRAY_SIZE() in the
      macro; this is a little safer.

      The openvswitch code has some indirection, so assign ops/n_ops directly
      in that code. This might ultimately just assign the pointers in the
      family initializations, saving the struct genl_family_and_ops and
      code (once mcast groups are handled differently.)
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
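The macro approach described in this commit can be sketched in plain userspace C. This is a minimal illustration, not the kernel's code: `struct demo_ops`, `struct demo_family`, `demo_register_with_ops_n()` and `demo_register_with_ops()` are hypothetical stand-ins for the genetlink types and functions, and `ARRAY_SIZE` is redefined locally.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array; gives a wrong answer if handed a pointer. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical stand-ins for struct genl_ops and struct genl_family. */
struct demo_ops {
    int cmd;
};

struct demo_family {
    const struct demo_ops *ops;
    size_t n_ops;
};

/* Function form: the caller supplies the count and can get it wrong. */
static int demo_register_with_ops_n(struct demo_family *family,
                                    const struct demo_ops *ops, size_t n_ops)
{
    assert(family != NULL && ops != NULL);
    family->ops = ops;
    family->n_ops = n_ops;
    return 0;
}

/* Macro form: only the array is passed and the count is derived from it. */
#define demo_register_with_ops(family, ops) \
    demo_register_with_ops_n((family), (ops), ARRAY_SIZE(ops))

static const struct demo_ops demo_ops_table[] = {
    { .cmd = 1 }, { .cmd = 2 }, { .cmd = 3 },
};

/* Registers the table through the macro and reports the recorded count. */
static size_t demo_registered_count(void)
{
    struct demo_family family = { 0 };

    demo_register_with_ops(&family, demo_ops_table);
    return family.n_ops;
}
```

Calling `demo_registered_count()` records three operations without the caller ever writing the number 3, which is the safety gain the commit message refers to. The macro only works when it is given a real array, not a pointer to one.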
    • tcp: don't update snd_nxt, when a socket is switched from repair mode · dbde4979
      Andrey Vagin authored
      snd_nxt must be updated synchronously with sk_send_head.  Otherwise
      tp->packets_out may be updated incorrectly, which may cause a kernel panic.
      
      Here is a kernel panic from my host.
      [  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
      [  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
      ...
      [  146.301158] Call Trace:
      [  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0
      
      Before this panic a tcp socket was restored. This socket had sent and
      unsent data in the write queue. Sent data was restored in repair mode,
      then the socket was switched from repair mode and unsent data was
      restored. After that the socket was switched back into repair mode.

      At that moment the socket's write queue looked like this:
      snd_una    snd_nxt   write_seq
         |_________|________|
                   |
      	  sk_send_head
      
      After the second switch out of repair mode, the state of the socket
      changed:
      
      snd_una          snd_nxt, write_seq
         |_________ ________|
                   |
      	  sk_send_head
      
      This state is inconsistent, because snd_nxt and sk_send_head are not
      synchronized.
      
      Below is a call trace showing how packets_out can be incremented
      twice for one skb if snd_nxt and sk_send_head are not synchronized.
      In this case packets_out will always be positive, even when
      sk_write_queue is empty.
      
      tcp_write_wakeup
      	skb = tcp_send_head(sk);
      	tcp_fragment
      		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
      			tcp_adjust_pcount(sk, skb, diff);
      	tcp_event_new_data_sent
      		tp->packets_out += tcp_skb_pcount(skb);
      
      I think updating snd_nxt isn't required when a socket is switched from
      repair mode, because it is initialized in tcp_connect_init. Then, when a
      write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
      so it is always in a consistent state.

      I have checked that the bug no longer reproduces with this patch and
      that all tests for restoring tcp connections work fine.
      
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: James Morris <jmorris@namei.org>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Acked-by: Pavel Emelyanov <xemul@parallels.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atm: idt77252: fix dev refcnt leak · b5de4a22
      Ying Xue authored
      init_card() calls dev_get_by_name() to get a network device, but it
      doesn't decrease the network device reference count after the device
      is used.
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
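The get/put pairing behind this kind of fix can be modelled in userspace. The sketch below is an assumption-laden illustration, not the driver's code: `struct demo_dev`, `demo_dev_get_by_name()`, `demo_dev_put()` and `demo_use_device()` are hypothetical names that mimic how a lookup takes a reference which the caller must later drop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device with a visible refcount. */
struct demo_dev {
    const char *name;
    int refcnt;
};

static struct demo_dev demo_devices[] = {
    { "eth0", 1 },
    { "eth1", 1 },
};

/* Models dev_get_by_name(): a successful lookup takes a reference. */
static struct demo_dev *demo_dev_get_by_name(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(demo_devices) / sizeof(demo_devices[0]); i++) {
        if (strcmp(demo_devices[i].name, name) == 0) {
            demo_devices[i].refcnt++;
            return &demo_devices[i];
        }
    }
    return NULL;
}

/* Models dev_put(): drops the reference taken by the lookup. */
static void demo_dev_put(struct demo_dev *dev)
{
    assert(dev != NULL && dev->refcnt > 0);
    dev->refcnt--;
}

/* Uses a device briefly; returns its refcount afterwards, or -1 if absent. */
static int demo_use_device(const char *name)
{
    struct demo_dev *dev = demo_dev_get_by_name(name);

    if (dev == NULL)
        return -1;
    /* ... read whatever is needed from dev here ... */
    demo_dev_put(dev); /* omitting this call is the leak */
    return dev->refcnt;
}
```

With the put in place the count returns to its starting value after every use. Without it, each call would leave the count one higher and the device could never be freed.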
    • xfrm: Release dst if this dst is improper for vti tunnel · 236c9f84
      fan.du authored
      After looking up rt by the vti tunnel dst/src parameters,
      if this rt is either not attached to any transformation
      or the transformation is not tunnel oriented, the rt
      should be released back to the IP layer.

      Otherwise the dst is leaked.
      Signed-off-by: Fan Du <fan.du@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: fix documentation typo in netlink_set_err() · 840e93f2
      Johannes Berg authored
      The parameter is just 'group', not 'groups', fix the documentation typo.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 18 Nov, 2013 14 commits
  3. 16 Nov, 2013 5 commits
  4. 15 Nov, 2013 7 commits
    • net: ethernet: ti/cpsw: do not crash on single-MAC machines during resume · 1e7a2e21
      Daniel Mack authored
      During resume, use for_each_slave to walk the slaves of the cpsw, and
      soft-reset each of them. This prevents oopses if there is only one
      slave configured.
      Signed-off-by: Daniel Mack <zonque@gmail.com>
      Acked-by: Mugunthan V N <mugunthanvnm@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'macvlan' · 82c80e9d
      David S. Miller authored
      Michal Kubecek says:
      
      ====================
      macvlan: disable LRO on lowerdev instead of a macvlan
      
      A customer of ours encountered a problem with LRO on an ixgbe network
      card. Analysis showed that it was a known conflict of forwarding and LRO
      but the forwarding was enabled in an LXC container where only a macvlan
      was, not the ethernet device itself.
      
      I believe the solution is exactly the same as what we do for "normal"
      (802.1q) VLAN devices: if dev_disable_lro() is called for such device,
      LRO is disabled on the underlying "real" device instead.
      
      v2: adapt to changes merged from net-next
      
      v3: use BUG() in macvlan_dev_real_dev() if compiled without macvlan
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvlan: disable LRO on lower device instead of macvlan · 529d0489
      Michal Kubeček authored
      A macvlan device always has LRO disabled, so calling
      dev_disable_lro() on it does nothing. If we need to disable LRO
      e.g. because
      
        - the macvlan device is inserted into a bridge
        - IPv6 forwarding is enabled for it
        - it is in a different namespace than lowerdev and IPv4
          forwarding is enabled in it
      
      we need to disable LRO on its underlying device instead (as we
      do for 802.1q VLAN devices).
      
      v2: use newly introduced netif_is_macvlan()
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvlan: introduce macvlan_dev_real_dev() helper function · be9eac48
      Michal Kubeček authored
      Introduce helper function macvlan_dev_real_dev which returns the
      underlying device of a macvlan device, similar to vlan_dev_real_dev()
      for 802.1q VLAN devices.
      
      v2: IFF_MACVLAN flag and equivalent of is_macvlan_dev() were
      introduced in the meantime
      
      v3: do BUG() if compiled without macvlan support
      Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: add ip checks when store ip target · f9de11a1
      Wang Weidong authored
      I hit a bug when adding an ip target with an invalid ip address:

      echo +500.500.500.500 > /sys/class/net/bond0/bonding/arp_ip_target

      The invalid ip address is converted to 245.245.245.244 and added
      to the ip targets successfully. This is incorrect, so I add checks to
      avoid adding invalid addresses.

      in4_pton() sets an invalid ip address to 0.0.0.0, which is then rejected
      by the next check and is not added to the ip targets.

      v2
      Following Veaceslav's suggestion, simplify the code.

      v3
      Following Veaceslav's suggestion, add a broadcast check and wrap it in
      a macro definition.

      v4
      Fix the formatting problem that David pointed out.
      Suggested-by: Veaceslav Falico <vfalico@redhat.com>
      Suggested-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
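The kind of check this commit describes (reject malformed, all-zero and broadcast addresses) can be approximated in userspace. The sketch below uses the POSIX `inet_pton()` rather than the kernel's `in4_pton()`, and `demo_ip_target_usable()` is a hypothetical name; it assumes the leading `+` of the sysfs syntax has already been stripped.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true only for a well-formed, non-zero, non-broadcast IPv4 address. */
static bool demo_ip_target_usable(const char *text)
{
    struct in_addr addr;

    assert(text != NULL);

    /* inet_pton() returns 1 only for a valid dotted quad. */
    if (inet_pton(AF_INET, text, &addr) != 1)
        return false;

    /* 0.0.0.0 is what a failed parse leaves behind in the commit's description. */
    if (addr.s_addr == htonl(INADDR_ANY))
        return false;

    /* 255.255.255.255 cannot be a meaningful ARP target. */
    if (addr.s_addr == htonl(INADDR_BROADCAST))
        return false;

    return true;
}
```

With this check `500.500.500.500` is refused up front instead of being silently wrapped into some other address, while an ordinary address such as `192.168.1.1` is accepted.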
    • 6lowpan: Uncompression of traffic class field was incorrect · 1188f054
      Jukka Rissanen authored
      If priority/traffic class field in IPv6 header is set (seen when
      using ssh), the uncompression sets the TC and Flow fields incorrectly.
      
      Example:
      
      This is IPv6 header of a sent packet. Note the priority/TC (=1) in
      the first byte.
      
      00000000: 61 00 00 00 00 2c 06 40 fe 80 00 00 00 00 00 00
      00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00
      00000020: 02 1e ab ff fe 4c 52 57
      
      This gets compressed like this on the sending side
      
      00000000: 72 31 04 06 02 1e ab ff fe 4c 52 57 ec c2 00 16
      00000010: aa 2d fe 92 86 4e be c6 ....
      
      On the receiving end, the packet gets uncompressed to this
      IPv6 header
      
      00000000: 60 06 06 02 00 2a 1e 40 fe 80 00 00 00 00 00 00
      00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00
      00000020: ab ff fe 4c 52 57 ec c2
      
      First four bytes are set incorrectly and we have also lost
      two bytes from destination address.
      
      The fix is to switch the case values in switch statement
      when checking the TC field.
      Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
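To see how the first four bytes of the two dumps above differ, they can be decoded with the standard IPv6 header layout: 4 bits of version, 8 bits of traffic class and 20 bits of flow label. The following is a small userspace sketch; `struct demo_ipv6_word` and `demo_decode_first_word()` are hypothetical names and are not part of the 6lowpan code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields carried in the first 32 bits of an IPv6 header. */
struct demo_ipv6_word {
    unsigned version;        /* 4 bits */
    unsigned traffic_class;  /* 8 bits */
    uint32_t flow_label;     /* 20 bits */
};

/* Splits the first four header bytes (network order) into their fields. */
static struct demo_ipv6_word demo_decode_first_word(const uint8_t b[4])
{
    struct demo_ipv6_word w;

    assert(b != NULL);
    w.version = b[0] >> 4;
    w.traffic_class = (unsigned)((b[0] & 0x0f) << 4) | (unsigned)(b[1] >> 4);
    w.flow_label = ((uint32_t)(b[1] & 0x0f) << 16) |
                   ((uint32_t)b[2] << 8) |
                   (uint32_t)b[3];
    return w;
}
```

Fed the sent bytes `61 00 00 00`, this yields version 6, traffic class 0x10 and flow label 0. Fed the received bytes `60 06 06 02`, it yields traffic class 0 and flow label 0x060602, so the traffic class was lost and unrelated bytes ended up in the flow label.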
    • tipc: fix dereference before check warning · 3db0a197
      Erik Hugne authored
      This fixes the following Smatch warning:
      net/tipc/link.c:2364 tipc_link_recv_fragment()
          warn: variable dereferenced before check '*head' (see line 2361)
      
      A null pointer might be passed to skb_try_coalesce if
      a malicious sender injects orphan fragments on a link.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 14 Nov, 2013 7 commits
    • ipv4: fix possible seqlock deadlock · c9e90429
      Eric Dumazet authored
      ip4_datagram_connect() being called from process context,
      it should use IP_INC_STATS() instead of IP_INC_STATS_BH()
      otherwise we can deadlock on 32bit arches, or get corruptions of
      SNMP counters.
      
      Fixes: 584bdf8c ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Dave Jones <davej@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/hsr: Fix possible leak in 'hsr_get_node_status()' · 84a035f6
      Geyslan G. Bem authored
      If 'hsr_get_node_data()' returns an error, going directly to the 'fail'
      label doesn't free the memory pointed to by 'skb_out'.
      Signed-off-by: Geyslan G. Bem <geyslan@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · 8422d1f1
      David S. Miller authored
      John W. Linville says:
      
      ====================
      pull request: wireless 2013-11-14
      
      Please pull this batch of fixes intended for the 3.13 stream!
      
      Amitkumar Karwar offers a quartet of mwifiex fixes, including an
      endian fix and three fixes for invalid memory access.
      
      Avinash Patil trims the packet length value for packets received from
      an SDIO interface.
      
      Colin Ian King fixes a NULL pointer dereference in the rtlwifi
      efuse code.
      
      Dan Carpenter cleans up an mwifiex integer underflow, a potential
      libertas oops, a memory corruption bug in wcn36xx, and a locking issue
      also in wcn36xx.
      
      Dan Williams helps prism54 devices to avoid being misclassified as
      Ethernet devices.
      
      Felipe Pena fixes a couple of typo errors, one in rt2x00 and the
      other in rtlwifi.
      
      Janusz Dziedzic corrects a pair of DFS-related problems in ath9k.
      
      Larry Finger patches three rtlwifi drivers to correctly report signal
      strength even for an unassociated AP.
      
      Mark Cave-Ayland rewrites some endian-illiterate packet type extraction
      code in rtlwifi.
      
      Stanislaw Gruszka addresses an rt2x00 regression related to setting
      HT station WCID and AMPDU density parameters.
      
      Sujith Manoharan corrects the initvals settings for AR9485.
      
      Ujjal Roy patches an obscure bit of code in mwifiex that was using
      the wrong definition of eth_hdr when bridging packets in AP mode.
      
      Wei Yongjun fixes a couple of bugs: one is a return code handling
      bug in libertas; and, the other is a locking issue in wcn36xx.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • virtio-net: mergeable buffer size should include virtio-net header · 5061de36
      Michael Dalton authored
      Commit 2613af0e ("virtio_net: migrate mergeable rx buffers to page
      frag allocators") changed the mergeable receive buffer size from PAGE_SIZE
      to MTU-size. However, the merge buffer size does not take into account the
      size of the virtio-net header. Consequently, packets that are MTU-size
      will take two buffers instead of one (to store the virtio-net header),
      substantially decreasing the throughput of MTU-size traffic due to TCP
      window / SKB truesize effects.
      
      This commit changes the mergeable buffer size to include the virtio-net
      header. The buffer size is cacheline-aligned because skb_page_frag_refill
      will not automatically align the requested size.
      
      Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
      between two QEMU VMs on a single physical machine. Each VM has two VCPUs and
      vhost enabled. All VMs and vhost threads run in a single 4 CPU cgroup
      cpuset, using cgroups to ensure that other processes in the system will not
      be scheduled on the benchmark CPUs. Transmit offloads and mergeable receive
      buffers are enabled, but guest_tso4 / guest_csum are explicitly disabled to
      force MTU-sized packets on the receiver.
      
      net-next trunk before 2613af0e (PAGE_SIZE buf): 3861.08Gb/s
      net-next trunk (MTU 1500 - packet uses two buf due to size bug): 4076.62Gb/s
      net-next trunk (MTU 1480 - packet fits in one buf): 6301.34Gb/s
      net-next trunk w/ size fix (MTU 1500 - packet fits in one buf): 6445.44Gb/s
      Suggested-by: Eric Northup <digitaleric@google.com>
      Signed-off-by: Michael Dalton <mwdalton@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
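The sizing argument in this commit can be worked through with a little arithmetic. The sketch below rests on several assumptions that are not stated in the commit message: a 12-byte mergeable receive header, a 64-byte cache line, and an old buffer size of Ethernet header plus VLAN header plus 1500 bytes of payload. All `demo_` and `DEMO_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ETH_HLEN     14  /* Ethernet header */
#define DEMO_VLAN_HLEN    4   /* VLAN tag */
#define DEMO_MRG_HDR_LEN  12  /* assumed size of the mergeable rx header */
#define DEMO_CACHELINE    64  /* assumed cache line size */

/* Rounds n up to a multiple of align, which must be a power of two. */
static size_t demo_align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Buffer size before the fix (assumed): room for the frame only. */
static size_t demo_old_buffer_len(size_t mtu)
{
    return DEMO_ETH_HLEN + DEMO_VLAN_HLEN + mtu;
}

/* Buffer size after the fix: frame plus header, cacheline-aligned. */
static size_t demo_new_buffer_len(size_t mtu)
{
    return demo_align_up(demo_old_buffer_len(mtu) + DEMO_MRG_HDR_LEN,
                         DEMO_CACHELINE);
}

/* Number of buffers consumed by one untagged frame carrying `payload` bytes. */
static size_t demo_buffers_needed(size_t payload, size_t buf_len)
{
    size_t total = DEMO_MRG_HDR_LEN + DEMO_ETH_HLEN + payload;

    return (total + buf_len - 1) / buf_len;
}
```

Under these assumptions a 1500-byte payload needs 1526 bytes including the header, which overflows a 1518-byte buffer and spills into a second one, while a 1480-byte payload still fits. With the header included and the size aligned, the buffer grows to 1536 bytes and the full-size frame fits in one, matching the pattern in the benchmark rows above.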
    • connector: improved unaligned access error fix · 1ca1a4cf
      Chris Metcalf authored
      In af3e095a, Erik Jacobsen fixed one type of unaligned access
      bug for ia64 by converting a 64-bit write to use put_unaligned().
      Unfortunately, since gcc will convert a short memset() to a series
      of appropriately-aligned stores, the problem is now visible again
      on tilegx, where the memset that zeros out proc_event is converted
      to three 64-bit stores, causing an unaligned access panic.
      
      A better fix for the original problem is to ensure that proc_event
      is aligned to 8 bytes here.  We can do that relatively easily by
      arranging to start the struct cn_msg aligned to 8 bytes and then
      offset by 4 bytes.  Doing so means that the immediately following
      proc_event structure is then correctly aligned to 8 bytes.
      
      The result is that the memset() stores are now aligned, and as an
      added benefit, we can remove the put_unaligned() calls in the code.
      Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
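The "align to 8 bytes, then offset by 4" trick works because the header in front of the payload is 20 bytes long, so 4 + 20 = 24 is a multiple of 8. The sketch below checks that arithmetic with a hypothetical `struct demo_hdr` that has the same 20-byte size assumed here for struct cn_msg; it models offsets from an 8-byte-aligned buffer start rather than real kernel memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 20-byte header standing in for struct cn_msg. */
struct demo_hdr {
    uint32_t idx;    /* idx and val model the 8-byte callback id */
    uint32_t val;
    uint32_t seq;
    uint32_t ack;
    uint16_t len;
    uint16_t flags;
};

/*
 * Given the header's offset from an 8-byte-aligned buffer start, returns
 * how far the payload that follows the header is from 8-byte alignment
 * (0 means the payload is aligned).
 */
static size_t demo_payload_misalignment(size_t hdr_offset)
{
    /* The header itself only needs 4-byte alignment. */
    assert(hdr_offset % 4 == 0);
    return (hdr_offset + sizeof(struct demo_hdr)) % 8;
}
```

Placing the header at offset 0 leaves the payload 4 bytes off an 8-byte boundary, which is the situation that trapped on tilegx. Placing it at offset 4 puts the payload at offset 24, so 64-bit stores into it are aligned.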
    • pkt_sched: fq: change classification of control packets · 2abc2f07
      Maciej Żenczykowski authored
      Initial sch_fq implementation copied code from pfifo_fast to classify
      a packet as a high prio packet.
      
      This clashes with setups using PRIO with, say, 7 bands, as one of the
      bands could be misclassified by FQ.

      Packets would be queued in the 'internal' queue, and no pacing would
      ever happen for this special queue.
      
      Fixes: afe4fd06 ("pkt_sched: fq: Fair Queue packet scheduler")
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • alx: Reset phy speed after resume · b54629e2
      hahnjo authored
      This fixes bug 62491 (https://bugzilla.kernel.org/show_bug.cgi?id=62491).
      After resuming some users got the following error flooding the kernel log:
      alx 0000:02:00.0: invalid PHY speed/duplex: 0xffff
      Signed-off-by: Jonas Hahnfeld <linux@hahnjo.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>