1. 15 Jan, 2017 11 commits
    • igmp: Make igmp group member RFC 3376 compliant · 0d431f94
      Michal Tesar authored
      [ Upstream commit 7ababb78 ]
      
      5.2. Action on Reception of a Query
      
       When a system receives a Query, it does not respond immediately.
       Instead, it delays its response by a random amount of time, bounded
       by the Max Resp Time value derived from the Max Resp Code in the
       received Query message.  A system may receive a variety of Queries on
       different interfaces and of different kinds (e.g., General Queries,
       Group-Specific Queries, and Group-and-Source-Specific Queries), each
       of which may require its own delayed response.
      
       Before scheduling a response to a Query, the system must first
       consider previously scheduled pending responses and in many cases
       schedule a combined response.  Therefore, the system must be able to
       maintain the following state:
      
       o A timer per interface for scheduling responses to General Queries.
      
       o A per-group and interface timer for scheduling responses to Group-
         Specific and Group-and-Source-Specific Queries.
      
       o A per-group and interface list of sources to be reported in the
         response to a Group-and-Source-Specific Query.
      
       When a new Query with the Router-Alert option arrives on an
       interface, provided the system has state to report, a delay for a
       response is randomly selected in the range (0, [Max Resp Time]) where
       Max Resp Time is derived from Max Resp Code in the received Query
       message.  The following rules are then used to determine if a Report
       needs to be scheduled and the type of Report to schedule.  The rules
       are considered in order and only the first matching rule is applied.
      
       1. If there is a pending response to a previous General Query
          scheduled sooner than the selected delay, no additional response
          needs to be scheduled.
      
       2. If the received Query is a General Query, the interface timer is
          used to schedule a response to the General Query after the
          selected delay.  Any previously pending response to a General
          Query is canceled.
      --8<--
      
      Currently the timer is rearmed with a new random expiration time for
      every incoming query, regardless of any report that is already pending,
      which is not aligned with the RFC text above.
      A higher rate of incoming queries can also postpone the report past
      the expiration time of the first query, causing group membership loss.
      
      Now the per-interface general query timer is rearmed only
      when there is no pending report already scheduled on that interface or
      the newly selected expiration time is before the already pending
      scheduled report.
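
      The rearm rule described above can be sketched in plain userspace C.
      This is a minimal illustration, not the kernel patch: struct gq_timer,
      gq_timer_rearm() and the millisecond time values are hypothetical names
      and units chosen for this example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-interface general query timer. */
struct gq_timer {
	bool pending;          /* is a report already scheduled? */
	unsigned long expires; /* absolute expiry time, in ms (assumed unit) */
};

/*
 * Rearm only when no report is pending, or when the newly selected
 * expiry is earlier than the pending one. Returns true if the timer
 * was (re)armed.
 */
static bool gq_timer_rearm(struct gq_timer *t, unsigned long new_expires)
{
	if (t->pending && t->expires <= new_expires)
		return false; /* keep the earlier pending report */

	t->pending = true;
	t->expires = new_expires;
	return true;
}
```

      With this rule a burst of queries can only pull the pending report
      earlier, never push it later, so the report for the first query is
      not postponed past its own deadline.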
      Signed-off-by: Michal Tesar <mtesar@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drop_monitor: consider inserted data in genlmsg_end · 14e8d568
      Reiter Wolfgang authored
      [ Upstream commit 3b48ab22 ]
      
      Final nlmsg_len field update must reflect inserted net_dm_drop_point
      data.
      
      This patch depends on previous patch:
      "drop_monitor: add missing call to genlmsg_end"
      Signed-off-by: Reiter Wolfgang <wr0112358@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drop_monitor: add missing call to genlmsg_end · 81e79164
      Reiter Wolfgang authored
      [ Upstream commit 4200462d ]
      
      Update the nlmsg_len field with genlmsg_end to enable userspace
      processing using the nlmsg_next helper. Also add error handling.
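
      Why a stale nlmsg_len breaks userspace can be sketched with the
      standard NLMSG_OK/NLMSG_NEXT macros from <linux/netlink.h>, which play
      the same role as the nlmsg_next helper mentioned above. This is an
      illustration only: count_messages() is a hypothetical helper, not
      part of drop_monitor or of any library.

```c
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Hypothetical helper: count the netlink messages in a receive buffer.
 * The walk relies entirely on each header's nlmsg_len to find the next
 * header, so a length that does not cover the payload derails it.
 */
static int count_messages(void *buf, int len)
{
	struct nlmsghdr *nlh = buf;
	int n = 0;

	for (; NLMSG_OK(nlh, len); nlh = NLMSG_NEXT(nlh, len))
		n++;
	return n;
}
```

      If the first header in a buffer only accounts for the header itself,
      the walker lands inside that message's payload and stops (or
      misparses) instead of reaching the second message.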
      Signed-off-by: Reiter Wolfgang <wr0112358@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/mlx5: Avoid shadowing numa_node · 1ff0308f
      Eli Cohen authored
      [ Upstream commit d151d73d ]
      
      Avoid using a local variable named numa_node to avoid shadowing a public
      one.
      
      Fixes: db058a18 ('net/mlx5_core: Set irq affinity hints')
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/mlx5: Check FW limitations on log_max_qp before setting it · 18d971f8
      Noa Osherovich authored
      [ Upstream commit 883371c4 ]
      
      When setting HCA capabilities, set log_max_qp to be the minimum
      between the selected profile's value and the HCA limitation.
      
      Fixes: 938fe83c ('net/mlx5_core: New device capabilities...')
      Signed-off-by: Noa Osherovich <noaos@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: stmmac: Fix race between stmmac_drv_probe and stmmac_open · 3f284760
      Florian Fainelli authored
      [ Upstream commit 57016590 ]
      
      There is currently a small window during which the network device registered by
      stmmac can be made visible, yet all resources, including the clock and MDIO bus,
      have not had a chance to be set up. This can lead to the following error:
      
      [  473.919358] stmmaceth 0000:01:00.0 (unnamed net_device) (uninitialized):
                      stmmac_dvr_probe: warning: cannot get CSR clock
      [  473.919382] stmmaceth 0000:01:00.0: no reset control found
      [  473.919412] stmmac - user ID: 0x10, Synopsys ID: 0x42
      [  473.919429] stmmaceth 0000:01:00.0: DMA HW capability register supported
      [  473.919436] stmmaceth 0000:01:00.0: RX Checksum Offload Engine supported
      [  473.919443] stmmaceth 0000:01:00.0: TX Checksum insertion supported
      [  473.919451] stmmaceth 0000:01:00.0 (unnamed net_device) (uninitialized):
                      Enable RX Mitigation via HW Watchdog Timer
      [  473.921395] libphy: PHY stmmac-1:00 not found
      [  473.921417] stmmaceth 0000:01:00.0 eth0: Could not attach to PHY
      [  473.921427] stmmaceth 0000:01:00.0 eth0: stmmac_open: Cannot attach to
                      PHY (error: -19)
      [  473.959710] libphy: stmmac: probed
      [  473.959724] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 0 IRQ POLL
                      (stmmac-1:00) active
      [  473.959728] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 1 IRQ POLL
                      (stmmac-1:01)
      [  473.959731] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 2 IRQ POLL
                      (stmmac-1:02)
      [  473.959734] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 3 IRQ POLL
                      (stmmac-1:03)
      
      Fix this by making sure that register_netdev() is the last thing being done,
      which guarantees that the clock and the MDIO bus are available.
      
      Fixes: 4bfcbd7a ("stmmac: Move the mdio_register/_unregister in probe/remove")
      Reported-by: Kweh, Hock Leong <hock.leong.kweh@intel.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net, sched: fix soft lockup in tc_classify · 67bce582
      Daniel Borkmann authored
      [ Upstream commit 628185cf ]
      
      Shahar reported a soft lockup in tc_classify(), where we run into an
      endless loop when walking the classifier chain due to tp->next == tp
      which is a state we should never run into. The issue only seems to
      trigger under load in the tc control path.
      
      What happens is that in tc_ctl_tfilter(), thread A allocates a new
      tp, initializes it, sets tp_created to 1, and calls into tp->ops->change()
      with it. In that classifier callback we had to unlock/lock the rtnl
      mutex and returned with -EAGAIN. One reason why we need to drop there
      is, for example, that we need to request an action module to be loaded.
      
      This happens via tcf_exts_validate() -> tcf_action_init/_1() meaning
      after we loaded and found the requested action, we need to redo the
      whole request so we don't race against others. While we had to unlock
      rtnl in that time, thread B's request was processed next on that CPU.
      Thread B added a new tp instance successfully to the classifier chain.
      When thread A returned grabbing the rtnl mutex again, propagating -EAGAIN
      and destroying its tp instance which never got linked, we goto replay
      and redo A's request.
      
      This time when walking the classifier chain in tc_ctl_tfilter() for
      checking for existing tp instances we had a priority match and found
      the tp instance that was created and linked by thread B. Now calling
      again into tp->ops->change() with that tp was successful and returned
      without error.
      
      tp_created was never cleared in the second round, thus the kernel thinks
      that we need to link it into the classifier chain (once again). tp and
      *back point to the same object due to the match we had earlier on. Thus
      for thread B's already public tp, we reset tp->next to tp itself and
      link it into the chain, which eventually causes the mentioned endless
      loop in tc_classify() once a packet hits the data path.
      
      The fix is to clear tp_created at the beginning of each request, also when
      we replay it. On the paths that can cause -EAGAIN we already destroy
      the original tp instance we had and on replay we really need to start
      from scratch. It seems that this issue was first introduced in commit
      12186be7 ("net_cls: fix unconfigured struct tcf_proto keeps chaining
      and avoid kernel panic when we use cls_cgroup").
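
      The replay problem can be reproduced in a small userspace model. This
      is a simplified sketch, not the code in tc_ctl_tfilter(): struct node,
      find_prio(), add_filter() and the clear_on_replay switch are
      hypothetical, and the concurrent thread B is simulated by linking
      "racer" during the first attempt.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified chain node; not the kernel's struct tcf_proto. */
struct node {
	int prio;
	struct node *next;
};

static struct node *find_prio(struct node *head, int prio)
{
	for (; head; head = head->next)
		if (head->prio == prio)
			return head;
	return NULL;
}

/*
 * Model of one add request (thread A). On the first attempt the simulated
 * change() callback "drops the lock": racer (thread B) links itself and
 * the request is replayed, as with -EAGAIN. clear_on_replay selects
 * whether the created flag is reset at the start of every attempt.
 */
static void add_filter(struct node **head, struct node *fresh,
		       struct node *racer, bool clear_on_replay)
{
	bool created = false;
	bool first_try = true;
	struct node *tp;

replay:
	if (clear_on_replay)
		created = false; /* start each attempt from scratch */

	tp = find_prio(*head, fresh->prio);
	if (!tp) {
		tp = fresh;
		tp->next = NULL;
		created = true;
	}

	if (first_try) {
		first_try = false;
		racer->next = *head; /* thread B links its node meanwhile */
		*head = racer;
		goto replay;
	}

	if (created) { /* stale flag relinks an already public node */
		tp->next = *head;
		*head = tp;
	}
}
```

      Without the reset, the second attempt finds racer at the head of the
      chain and links it again, leaving racer->next pointing at racer
      itself. With the reset, the chain keeps a single, properly terminated
      entry.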
      
      Fixes: 12186be7 ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup")
      Reported-by: Shahar Klein <shahark@mellanox.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Shahar Klein <shahark@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: handle -EFAULT from skb_copy_bits · 58d0d7a4
      Dave Jones authored
      [ Upstream commit a98f9175 ]
      
      By setting certain socket options on ipv6 raw sockets, we can confuse the
      length calculation in rawv6_push_pending_frames triggering a BUG_ON.
      
      RIP: 0010:[<ffffffff817c6390>] [<ffffffff817c6390>] rawv6_sendmsg+0xc30/0xc40
      RSP: 0018:ffff881f6c4a7c18  EFLAGS: 00010282
      RAX: 00000000fffffff2 RBX: ffff881f6c681680 RCX: 0000000000000002
      RDX: ffff881f6c4a7cf8 RSI: 0000000000000030 RDI: ffff881fed0f6a00
      RBP: ffff881f6c4a7da8 R08: 0000000000000000 R09: 0000000000000009
      R10: ffff881fed0f6a00 R11: 0000000000000009 R12: 0000000000000030
      R13: ffff881fed0f6a00 R14: ffff881fee39ba00 R15: ffff881fefa93a80
      
      Call Trace:
       [<ffffffff8118ba23>] ? unmap_page_range+0x693/0x830
       [<ffffffff81772697>] inet_sendmsg+0x67/0xa0
       [<ffffffff816d93f8>] sock_sendmsg+0x38/0x50
       [<ffffffff816d982f>] SYSC_sendto+0xef/0x170
       [<ffffffff816da27e>] SyS_sendto+0xe/0x10
       [<ffffffff81002910>] do_syscall_64+0x50/0xa0
       [<ffffffff817f7cbc>] entry_SYSCALL64_slow_path+0x25/0x25
      
      Handle this by jumping to the failure path if skb_copy_bits returns -EFAULT.
      
      Reproducer:
      
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/types.h>
      #include <sys/socket.h>
      #include <netinet/in.h>
      
      #define LEN 504
      
      int main(int argc, char* argv[])
      {
      	int fd;
      	int zero = 0;
      	char buf[LEN];
      
      	memset(buf, 0, LEN);
      
      	fd = socket(AF_INET6, SOCK_RAW, 7);
      
      	setsockopt(fd, SOL_IPV6, IPV6_CHECKSUM, &zero, 4);
      	setsockopt(fd, SOL_IPV6, IPV6_DSTOPTS, &buf, LEN);
      
      	sendto(fd, buf, 1, 0, (struct sockaddr *) buf, 110);
      }
      Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: vrf: Drop conntrack data after pass through VRF device on Tx · 6ac0b381
      David Ahern authored
      [ Upstream commit eb63ecc1 ]
      
      Locally originated traffic in a VRF fails in the presence of a POSTROUTING
      rule. For example,
      
          $ iptables -t nat -A POSTROUTING -s 11.1.1.0/24  -j MASQUERADE
          $ ping -I red -c1 11.1.1.3
          ping: Warning: source address might be selected on device other than red.
          PING 11.1.1.3 (11.1.1.3) from 11.1.1.2 red: 56(84) bytes of data.
          ping: sendmsg: Operation not permitted
      
      Worse, the above causes random corruption resulting in a panic in random
      places (I have not seen a consistent backtrace).
      
      Call nf_reset to drop the conntrack info following the pass through the
      VRF device.  The nf_reset is needed on Tx but not Rx because of the order
      in which NF_HOOK's are hit: on Rx the VRF device is after the real ingress
      device and on Tx it is before the real egress device. Connection
      tracking should be tied to the real egress device and not the VRF device.
      
      Fixes: 8f58336d ("net: Add ethernet header for pass through VRF device")
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ser_gigaset: return -ENOMEM on error instead of success · 1e5298d4
      Dan Carpenter authored
      [ Upstream commit 93a97c50 ]
      
      If we can't allocate the resources in gigaset_initdriver() then we
      should return -ENOMEM instead of zero.
      
      Fixes: 2869b23e ("[PATCH] drivers/isdn/gigaset: new M101 driver (v2)")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netvsc: reduce maximum GSO size · 33c7b0f7
      stephen hemminger authored
      [ Upstream commit a50af86d ]
      
      Hyper-V (and Azure) support using NVGRE, which requires some extra space
      for encapsulation headers. Because of this, the largest allowed TSO
      packet is reduced.
      
      For older releases, hard code a fixed reduced value.  For the next
      release, there is a better solution which uses the result of host
      offload negotiation.
      Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 12 Jan, 2017 29 commits