1. 13 May, 2016 2 commits
    • udp: Resolve NULL pointer dereference over flow-based vxlan device · ed7cbbce
      Alexander Duyck authored
      While testing an OpenStack configuration using VXLANs I saw the following
      call trace:
      
       RIP: 0010:[<ffffffff815fad49>] udp4_lib_lookup_skb+0x49/0x80
       RSP: 0018:ffff88103867bc50  EFLAGS: 00010286
       RAX: ffff88103269bf00 RBX: ffff88103269bf00 RCX: 00000000ffffffff
       RDX: 0000000000004300 RSI: 0000000000000000 RDI: ffff880f2932e780
       RBP: ffff88103867bc60 R08: 0000000000000000 R09: 000000009001a8c0
       R10: 0000000000004400 R11: ffffffff81333a58 R12: ffff880f2932e794
       R13: 0000000000000014 R14: 0000000000000014 R15: ffffe8efbfd89ca0
       FS:  0000000000000000(0000) GS:ffff88103fd80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000488 CR3: 0000000001c06000 CR4: 00000000001426e0
       Stack:
        ffffffff81576515 ffffffff815733c0 ffff88103867bc98 ffffffff815fcc17
        ffff88103269bf00 ffffe8efbfd89ca0 0000000000000014 0000000000000080
        ffffe8efbfd89ca0 ffff88103867bcc8 ffffffff815fcf8b ffff880f2932e794
       Call Trace:
        [<ffffffff81576515>] ? skb_checksum+0x35/0x50
        [<ffffffff815733c0>] ? skb_push+0x40/0x40
        [<ffffffff815fcc17>] udp_gro_receive+0x57/0x130
        [<ffffffff815fcf8b>] udp4_gro_receive+0x10b/0x2c0
        [<ffffffff81605863>] inet_gro_receive+0x1d3/0x270
        [<ffffffff81589e59>] dev_gro_receive+0x269/0x3b0
        [<ffffffff8158a1b8>] napi_gro_receive+0x38/0x120
        [<ffffffffa0871297>] gro_cell_poll+0x57/0x80 [vxlan]
        [<ffffffff815899d0>] net_rx_action+0x160/0x380
        [<ffffffff816965c7>] __do_softirq+0xd7/0x2c5
        [<ffffffff8107d969>] run_ksoftirqd+0x29/0x50
        [<ffffffff8109a50f>] smpboot_thread_fn+0x10f/0x160
        [<ffffffff8109a400>] ? sort_range+0x30/0x30
        [<ffffffff81096da8>] kthread+0xd8/0xf0
        [<ffffffff81693c82>] ret_from_fork+0x22/0x40
        [<ffffffff81096cd0>] ? kthread_park+0x60/0x60
      
      The trace above is seen when receiving a DHCP request over a flow-based
      VXLAN tunnel.  I believe this is caused by the metadata dst having a NULL
      dev value and as a result dev_net(dev) is causing a NULL pointer dereference.
      
      To resolve this I am replacing the check for skb_dst(skb)->dev with just
      skb->dev.  This makes sense as the callers of this function are usually in
      the receive path and as such skb->dev should always be populated.  In
      addition other functions in the area where these are called are already
      using dev_net(skb->dev) to determine the namespace the UDP packet belongs
      in.
      
      Fixes: 63058308 ("udp: Add udp6_lib_lookup_skb and udp4_lib_lookup_skb")
      Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
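The change described in the udp commit above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the structures are heavily simplified stand-ins for the kernel types, and `lookup_net_old` and `lookup_net_new` are hypothetical names, not functions from the patch. It shows why taking the namespace from `skb->dev` avoids the fault when the dst is a metadata dst with a NULL `dev`.

```c
#include <stddef.h>

/* Simplified stand-ins for kernel types; only the fields used here. */
struct net { int id; };
struct net_device { struct net *nd_net; };
struct dst_entry { struct net_device *dev; };
struct sk_buff {
    struct net_device *dev;  /* receiving device, populated on the receive path */
    struct dst_entry *dst;   /* may be a metadata dst whose dev is NULL */
};

/* Model of dev_net(): dereferences dev, so a NULL dev faults. */
static struct net *dev_net(const struct net_device *dev)
{
    return dev->nd_net;
}

/* Hypothetical "before": namespace taken from the dst's device. */
struct net *lookup_net_old(const struct sk_buff *skb)
{
    return dev_net(skb->dst->dev);  /* NULL dereference for a metadata dst */
}

/* Hypothetical "after": namespace taken from the receiving device. */
struct net *lookup_net_new(const struct sk_buff *skb)
{
    return dev_net(skb->dev);
}
```

With a metadata dst, only the second variant is safe to call, which matches the commit's reasoning that callers are in the receive path where `skb->dev` is set.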
    • sunrpc: set SOCK_FASYNC · b4411457
      Eric Dumazet authored
      sunrpc is using SOCKWQ_ASYNC_NOSPACE without setting SOCK_FASYNC,
      so the recent optimizations done in sk_set_bit() and sk_clear_bit()
      broke it.
      
      There is still the risk that a subsequent sock_fasync() call
      would clear SOCK_FASYNC, but sunrpc does not use this yet.
      
      Fixes: 9317bb69 ("net: SOCKWQ_ASYNC_NOSPACE optimizations")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Jiri Pirko <jiri@resnulli.us>
      Reported-by: Huang, Ying <ying.huang@intel.com>
      Tested-by: Jiri Pirko <jiri@resnulli.us>
      Tested-by: Huang, Ying <ying.huang@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
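The interaction described in the sunrpc commit above can be modelled in user space. This is a sketch, not kernel code: `model_sock`, `model_sk_set_bit` and `model_test_bit` are hypothetical names, and the early return is an assumption about what the optimization does (skipping the async bits when the fasync flag is not set), based on the commit's description rather than on the patch itself.

```c
#include <stdbool.h>

enum { SOCKWQ_ASYNC_NOSPACE = 0, SOCKWQ_ASYNC_WAITDATA = 1 };

/* Hypothetical model of a socket: one flag plus the wait-queue bits. */
struct model_sock {
    bool fasync;             /* stands in for the SOCK_FASYNC flag */
    unsigned long wq_flags;  /* stands in for the socket wait-queue flags */
};

/* Assumed shape of the optimized setter: async bits are ignored
 * unless the socket has the fasync flag set. */
void model_sk_set_bit(int nr, struct model_sock *sk)
{
    if ((nr == SOCKWQ_ASYNC_NOSPACE || nr == SOCKWQ_ASYNC_WAITDATA) &&
        !sk->fasync)
        return;
    sk->wq_flags |= 1UL << nr;
}

bool model_test_bit(int nr, const struct model_sock *sk)
{
    return (sk->wq_flags >> nr) & 1UL;
}
```

Under this model, a user such as sunrpc that relies on SOCKWQ_ASYNC_NOSPACE never sees the bit unless the flag is set first, which is the breakage the commit addresses by setting SOCK_FASYNC.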
  2. 12 May, 2016 28 commits
  3. 11 May, 2016 10 commits
    • Merge branch 'mlx5-next' · 6a47a570
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox 100G mlx5 CQE compression
      
      Introducing the ConnectX-4 CQE (Completion Queue Entry) compression feature
      for the mlx5 Ethernet driver.
      
      CQE compression reduces PCI overhead by coalescing and compressing multiple CQEs into a
      single merged CQE.  Successful compression improves message rate, especially for small-packet
      traffic.
      
      CQE compression in detail:
      
      Instead of writing full CQEs to memory, multiple almost identical CQEs are merged and compressed.
      Information that is shared between the CQEs is written once, regardless of the number of
      compressed CQEs.  In addition, only the unique information (a small number of bytes compared to
      the full CQE size) is written per CQE.
      
      CQE Compression Block:
      
      This block contains multiple compressed CQEs.  A CQE Compression Block contains a single copy
      of the CQE properties that are shared between all the compressed CQEs (called the Title, see below)
      and multiple mini CQEs (CQEs in compressed form).
      
      Title:
      
      The Title holds information which is shared between all the compressed CQEs in the CQE Compression
      Block.  In each Compression Block there is only a single Title regardless of the number
      of compressed CQEs.
      
      Mini CQE:
      
      A CQE in compressed form that holds only the data needed to reconstruct a single full CQE, for example
      8 bytes instead of 64 bytes.
      The information shared between all compressed CQEs that belong to the same CQE Compression
      Block, called the Title, is written once; only the unique information in each compressed
      CQE (for example 8 bytes), called the mini CQE, is written per compressed CQE.
      
      Since CQE compression can add overhead to the software (CPU),
      it will only be enabled on "weak/slow" PCI slots, where it can actually help.
      
      Applied on top: c047c3b1 ('netfilter: conntrack: remove uninitialized shadow variable')
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
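The Title and mini CQE scheme described in the merge message above can be sketched as a small expansion routine. This is an illustrative model only: the field names, the field split between shared and per-packet data, and the function `model_expand_block` are all hypothetical and do not reflect the real mlx5 CQE layout. It shows the software side of the idea: each full CQE is rebuilt by copying the Title and overwriting the per-packet fields from one mini CQE.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical full CQE: two shared fields plus two per-packet fields. */
struct model_cqe {
    uint32_t queue_id;  /* shared by every CQE in the block */
    uint32_t flags;     /* shared by every CQE in the block */
    uint32_t byte_cnt;  /* unique per packet */
    uint32_t checksum;  /* unique per packet */
};

/* Hypothetical 8-byte mini CQE: only the per-packet fields. */
struct model_mini_cqe {
    uint32_t byte_cnt;
    uint32_t checksum;
};

/* Rebuild n full CQEs from one Title and n mini CQEs.
 * Returns the number of CQEs written to out. */
size_t model_expand_block(const struct model_cqe *title,
                          const struct model_mini_cqe *minis, size_t n,
                          struct model_cqe *out)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = *title;                      /* shared information */
        out[i].byte_cnt = minis[i].byte_cnt;  /* unique information */
        out[i].checksum = minis[i].checksum;
    }
    return n;
}
```

The expansion loop is also where the extra CPU cost mentioned in the message would come from in this model: the device writes less over PCI, and software does the copying instead.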
    • net/mlx5e: Enable CQE compression when PCI is slower than link · b797a684
      Saeed Mahameed authored
      We turn the feature on only for servers with PCI BW < MAX LINK BW, as it
      helps reduce PCI pressure on weak PCI slots, at the cost of some software
      overhead.
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
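The enabling rule stated in the commit above (PCI BW < MAX LINK BW) reduces to a single comparison. The sketch below is a minimal model of that rule only: the function name `model_cqe_compress_default` and the choice of Mb/s as the unit are assumptions, and the actual driver may apply further conditions that the commit message does not mention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: enable CQE compression by default only when the
 * PCI slot is slower than the maximum link speed. Both values are
 * assumed to be expressed in the same unit (here Mb/s). */
bool model_cqe_compress_default(uint32_t max_link_mbps, uint32_t pci_bw_mbps)
{
    return pci_bw_mbps < max_link_mbps;
}
```

For example, a 100000 Mb/s link behind a slot that sustains 63000 Mb/s would get compression under this rule, while a 40000 Mb/s link behind the same slot would not.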
    • net/mlx5e: Expand WQE stride when CQE compression is enabled · d9d9f156
      Tariq Toukan authored
      Make the MPWQE/Striding RQ default configuration dynamic and not
      statically set at compile time.  Now at driver load we set the
      stride size and number of strides dynamically.
      
      By default we use the same values as before, but when CQE compression
      is enabled, we set a larger stride size to benefit from CQE
      compression for larger packets.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: CQE compression · 7219ab34
      Tariq Toukan authored
      The CQE compression feature is meant to save PCIe bandwidth by
      compressing a few CQEs into a smaller number of bytes on PCIe.
      CQE compression can be selectively enabled per CQ.  It is disabled
      by default for now and will be enabled later on.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'more-dsa-probing' · c1869d58
      David S. Miller authored
      Andrew Lunn says:
      
      ====================
      More enabler patches for DSA probing
      
      The complete set of patches for the reworked DSA probing is too big to
      post at once. This subset contains some enablers which are easy to
      review.
      
      Eventually, the Marvell driver will instantiate its own internal MDIO
      bus, rather than have the framework do it, thus allowing devices on the
      bus to be listed in the device tree. Initialize the main mutex as soon
      as it is created, to avoid lifetime issues with the mdio bus.
      
      A previous patch renamed all the DSA probe functions to make room for
      a true device probe. However the recent merging of all the Marvell
      switch drivers resulted in mv88e6xxx going back to the old probe
      name. Rename it again, so we can have a driver probe function.
      
      Add minimum support for the Marvell switch driver to probe as an MDIO
      device, as well as a DSA driver. Later patches will then register
      this device with the new DSA core framework.
      
      Move the GPIO reset code out of the DSA code. Different drivers may
      need different reset mechanisms, e.g. via a reset controller for
      memory mapped devices. Don't clutter up the core with this. Let each
      driver implement what it needs.
      
      master_dev is no longer needed in the switch drivers, since they have
      access to a device pointer from the probe function. Remove it.
      
      Let the switch parse the eeprom length from its own device tree
      node. This is required with the new binding when the central DSA
      platform device no longer exists.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: mv88e6xxx: Handle eeprom-length property · f8cd8753
      Andrew Lunn authored
      A switch can export an attached EEPROM using the standard ethtool API.
      However the switch itself cannot determine the size of the EEPROM, and
      multiple sizes are allowed. Thus a device tree property is supported
      to indicate the length of the EEPROM. Parse this property during
      device probe, and implement a callback function to retrieve it.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
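The probe-time parsing described in the eeprom-length commit above can be modelled in user space. This is a sketch with stand-in types: `model_node`, `model_read_u32`, `model_probe` and `model_get_eeprom_len` are hypothetical names, and treating a missing property as length 0 is an assumption, not something the commit states. Only the property name `eeprom-length` comes from the commit title.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for a device tree node and the switch state. */
struct model_prop { const char *name; uint32_t value; };
struct model_node { const struct model_prop *props; size_t nprops; };
struct model_switch { uint32_t eeprom_len; };

/* Look up a 32-bit property by name; returns 0 on success, -1 if absent. */
int model_read_u32(const struct model_node *np, const char *name,
                   uint32_t *out)
{
    for (size_t i = 0; i < np->nprops; i++) {
        if (strcmp(np->props[i].name, name) == 0) {
            *out = np->props[i].value;
            return 0;
        }
    }
    return -1;
}

/* Probe: record the EEPROM length if the property is present.
 * Assumption: a missing property means length 0. */
void model_probe(struct model_switch *sw, const struct model_node *np)
{
    uint32_t len;

    sw->eeprom_len = 0;
    if (model_read_u32(np, "eeprom-length", &len) == 0)
        sw->eeprom_len = len;
}

/* Callback that reports the stored length. */
int model_get_eeprom_len(const struct model_switch *sw)
{
    return (int)sw->eeprom_len;
}
```

The split between probe and callback mirrors the commit's description: the property is parsed once at probe time, and the callback only returns the stored value.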
    • dsa: Rename switch chip data to cd · ff04955c
      Andrew Lunn authored
      The dsa_switch structure contains a dsa_chip_data member called pd.
      However in the rest of the code, pd is used for dsa_platform_data.
      This is confusing. Rename it cd, which is already often used in dsa.c
      and slave.c for this data type.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: Remove master_dev from switch structure · c33063d6
      Andrew Lunn authored
      The switch drivers only use the master_dev member for dev_info()
      messages.  Now that the device is passed to the old style probe, and
      new style drivers are probed as true Linux drivers, this is no longer
      needed.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: Move gpio reset into switch driver · 52638f71
      Andrew Lunn authored
      Resetting the switch is something the driver does, not the framework.
      So move the parsing of this property into the driver.
      
      There are no in-kernel users of this property, so moving it does not
      break anything. There is however a board which will make use of this
      property making its way into the kernel.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dsa: Add mdio device support to Marvell switches · 14c7b3c3
      Andrew Lunn authored
      Allow Marvell switches to be mdio devices. Currently the driver just
      allocates the private structure and detects what device is on the
      bus. Later patches will make them register with the DSA framework.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>