1. 20 Jun, 2014 2 commits
    • Neal Cardwell's avatar
      tcp: fix tcp_match_skb_to_sack() for unaligned SACK at end of an skb · 2cd0d743
      Neal Cardwell authored
      If there is an MSS change (or misbehaving receiver) that causes a SACK
      to arrive that covers the end of an skb but is less than one MSS, then
      tcp_match_skb_to_sack() was rounding up pkt_len to the full length of
      the skb ("Round if necessary..."), then chopping all bytes off the skb
      and creating a zero-byte skb in the write queue.
      
      This was visible now because the recently simplified TLP logic in
      bef1909e ("tcp: fixing TLP's FIN recovery") could find that 0-byte
      skb at the end of the write queue, and now that we do not check that
      skb's length we could send it as a TLP probe.
      
      Consider the following example scenario:
      
       mss: 1000
       skb: seq: 0 end_seq: 4000  len: 4000
       SACK: start_seq: 3999 end_seq: 4000
      
      The tcp_match_skb_to_sack() code will compute:
      
       in_sack = false
       pkt_len = start_seq - TCP_SKB_CB(skb)->seq = 3999 - 0 = 3999
       new_len = (pkt_len / mss) * mss = (3999/1000)*1000 = 3000
       new_len += mss = 4000
      
      Previously we would find the new_len > skb->len check failing, so we
      would fall through and set pkt_len = new_len = 4000 and chop off
      pkt_len of 4000 from the 4000-byte skb, leaving a 0-byte segment
      afterward in the write queue.
      
      With this new commit, we notice that the new new_len >= skb->len check
      succeeds, so that we return without trying to fragment.
      
      Fixes: adb92db8 ("tcp: Make SACK code to split only at mss boundaries")
      Reported-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Ilpo Jarvinen <ilpo.jarvinen@helsinki.fi>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2cd0d743
    • David S. Miller's avatar
      Revert "net: return actual error on register_queue_kobjects" · 8e4946cc
      David S. Miller authored
      This reverts commit d36a4f4b.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8e4946cc
  2. 19 Jun, 2014 2 commits
  3. 18 Jun, 2014 6 commits
    • Jie Liu's avatar
      net: return actual error on register_queue_kobjects · d36a4f4b
      Jie Liu authored
      Return the actual error code if call kset_create_and_add() failed
      
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJie Liu <jeff.liu@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d36a4f4b
    • Or Gerlitz's avatar
      bonding: Advertize vxlan offload features when supported · 5a7baa78
      Or Gerlitz authored
      When the underlying device supports TCP offloads for VXLAN/UDP
      encapulated traffic, we need to reflect that through the hw_enc_features
      field of the bonding net-device. This will cause the xmit path
      in the core networking stack to provide bonding with encapsulated
      GSO frames to offload into the HW etc.
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5a7baa78
    • Mirko Lindner's avatar
      skge: Added FS A8NE-FM to the list of 32bit DMA boards · ee14eb7b
      Mirko Lindner authored
      Added FUJITSU SIEMENS A8NE-FM to the list of 32bit DMA boards
      
      >From Tomi O.:
      After I added an entry to this MB into the skge.c
      driver in order to enable the mentioned 64bit dma disable quirk,
      the network data corruptions ended and everything is fine again.
      Signed-off-by: default avatarMirko Lindner <mlindner@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ee14eb7b
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 3a3ec1b2
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      netfilter fixes for net
      
      The following patchset contains netfilter updates for your net tree,
      they are:
      
      1) Fix refcount leak when dumping the dying/unconfirmed conntrack lists,
         from Florian Westphal.
      
      2) Fix crash in NAT when removing a netnamespace, also from Florian.
      
      3) Fix a crash in IPVS when trying to remove an estimator out of the
         sysctl scope, from Julian Anastasov.
      
      4) Add zone attribute to the routing to calculate the message size in
         ctnetlink events, from Ken-ichirou MATSUZAWA.
      
      5) Another fix for the dying/unconfirmed list which was preventing to
         dump more than one memory page of entries (~17 entries in x86_64).
      
      6) Fix missing RCU-safe list insertion in the rule replacement code
         in nf_tables.
      
      7) Since the new transaction infrastructure is in place, we have to
         upgrade the chain use counter from u16 to u32 to avoid overflow
         after more than 2^16 rules are added.
      
      8) Fix refcount leak when replacing rule in nf_tables. This problem
         was also introduced in new transaction.
      
      9) Call the ->destroy() callback when releasing nft-xt rules to fix
         module refcount leaks.
      
      10) Set the family in the netlink messages that contain set elements
          in nf_tables to make it consistent with other object types.
      
      11) Don't dump NAT port information if it is unset in nft_nat.
      
      12) Update the MAINTAINERS file, I have merged the ebtables entry
          into netfilter. While at it, also removed the netfilter users
          mailing list, the development list should be enough.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3a3ec1b2
    • Pablo Neira Ayuso's avatar
      MAINTAINERS: merge ebtables into netfilter entry · db9cf3a3
      Pablo Neira Ayuso authored
      Moreover, remove reference to the netfilter users mailing list,
      so they don't receive patches.
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      db9cf3a3
    • Fugang Duan's avatar
      net: fec: Don't clear IPV6 header checksum field when IP accelerator enable · 62a02c98
      Fugang Duan authored
      The commit 96c50caa (net: fec: Enable IP header hardware checksum)
      enable HW IP header checksum for IPV4 and IPV6, which causes IPV6 TCP/UDP
      cannot work. (The issue is reported by Russell King)
      
      For FEC IP header checksum function: Insert IP header checksum. This "IINS"
      bit is written by the user. If set, IP accelerator calculates the IP header
      checksum and overwrites the IINS corresponding header field with the calculated
      value. The checksum field must be cleared by user, otherwise the checksum
      always is 0xFFFF.
      
      So the previous patch clear IP header checksum field regardless of IP frame
      type.
      
      In fact, IP HW detect the packet as IPV6 type, even if the "IINS" bit is set,
      the IP accelerator is not triggered to calculates IPV6 header checksum because
      IPV6 frame format don't have checksum.
      
      So this results in the IPV6 frame being corrupted.
      
      The patch just add software detect the current packet type, if it is IPV6
      frame, it don't clear IP header checksum field.
      
      Cc: Russell King <linux@arm.linux.org.uk>
      Reported-and-tested-by: default avatarRussell King <linux@arm.linux.org.uk>
      Signed-off-by: default avatarFugang Duan <B38611@freescale.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      62a02c98
  4. 17 Jun, 2014 10 commits
    • Jean Delvare's avatar
      ptp: ptp_pch depends on x86_32 · bc56151d
      Jean Delvare authored
      The ptp_pch driver is for a companion chip to the Intel Atom E600
      series processors. These are 32-bit x86 processors so the driver is
      only needed on X86_32.
      Signed-off-by: default avatarJean Delvare <jdelvare@suse.de>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Acked-by: default avatarRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bc56151d
    • Dave Jones's avatar
      hyperv: fix apparent cut-n-paste error in send path teardown · 2f18423d
      Dave Jones authored
      c25aaf81: "hyperv: Enable sendbuf mechanism on the send path" added
      some teardown code that looks like it was copied from the recieve path
      above, but missed a variable name replacement.
      Signed-off-by: default avatarDave Jones <davej@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2f18423d
    • Dave Jones's avatar
      tcp: remove unnecessary tcp_sk assignment. · 17846376
      Dave Jones authored
      This variable is overwritten by the child socket assignment before
      it ever gets used.
      Signed-off-by: default avatarDave Jones <davej@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      17846376
    • Chris Metcalf's avatar
      net: tile: fix unused variable warning · 9ebe2435
      Chris Metcalf authored
      'i' is unused in tile_net_dev_init() after commit d581ebf5
      ("net: tile: Use helpers from linux/etherdevice.h to check/set MAC").
      Signed-off-by: default avatarChris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9ebe2435
    • Christian Riesch's avatar
      ptp: In the testptp utility, use clock_adjtime from glibc when available · 42e1358e
      Christian Riesch authored
      clock_adjtime was included in glibc 2.14. _GNU_SOURCE must be defined
      to make it available.
      Signed-off-by: default avatarChristian Riesch <christian.riesch@omicron.at>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Acked-by: default avatarRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      42e1358e
    • Jean Delvare's avatar
      isdn: hisax: Drop duplicate Kconfig entry · ddc6fbd8
      Jean Delvare authored
      There are 2 HISAX_AVM_A1_PCMCIA Kconfig entries. The kbuild system
      ignores the second one, and apparently nobody noticed the problem so
      far, so let's remove that second entry.
      Signed-off-by: default avatarJean Delvare <jdelvare@suse.de>
      Cc: Karsten Keil <isdn@linux-pingi.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ddc6fbd8
    • Jean Delvare's avatar
      isdn: hisax: Merge Kconfig ifs · a1c33346
      Jean Delvare authored
      The first half of the HiSax config options is presented if
      ISDN_DRV_HISAX!=n, while the second half of the options is presented
      if ISDN_DRV_HISAX. That's the same, so merge both conditionals.
      Signed-off-by: default avatarJean Delvare <jdelvare@suse.de>
      Cc: Karsten Keil <isdn@linux-pingi.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a1c33346
    • Tyler Hall's avatar
      slcan: Port write_wakeup deadlock fix from slip · a8e83b17
      Tyler Hall authored
      The commit "slip: Fix deadlock in write_wakeup" fixes a deadlock caused
      by a change made in both slcan and slip. This is a direct port of that
      fix.
      Signed-off-by: default avatarTyler Hall <tylerwhall@gmail.com>
      Cc: Oliver Hartkopp <socketcan@hartkopp.net>
      Cc: Andre Naujoks <nautsch2@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a8e83b17
    • Tyler Hall's avatar
      slip: Fix deadlock in write_wakeup · 661f7fda
      Tyler Hall authored
      Use schedule_work() to avoid potentially taking the spinlock in
      interrupt context.
      
      Commit cc9fa74e ("slip/slcan: added locking in wakeup function") added
      necessary locking to the wakeup function and 367525c8/ddcde142 ("can:
      slcan: Fix spinlock variant") converted it to spin_lock_bh() because the lock
      is also taken in timers.
      
      Disabling softirqs is not sufficient, however, as tty drivers may call
      write_wakeup from interrupt context. This driver calls tty->ops->write() with
      its spinlock held, which may immediately cause an interrupt on the same CPU and
      subsequent spin_bug().
      
      Simply converting to spin_lock_irq/irqsave() prevents this deadlock, but
      causes lockdep to point out a possible circular locking dependency
      between these locks:
      
      (&(&sl->lock)->rlock){-.....}, at: slip_write_wakeup
      (&port_lock_key){-.....}, at: serial8250_handle_irq.part.13
      
      The slip transmit is holding the slip spinlock when calling the tty write.
      This grabs the port lock. On an interrupt, the handler grabs the port
      lock and calls write_wakeup which grabs the slip lock. This could be a
      problem if a serial interrupt occurs on another CPU during the slip
      transmit.
      
      To deal with these issues, don't grab the lock in the wakeup function by
      deferring the writeout to a workqueue. Also hold the lock during close
      when de-assigning the tty pointer to safely disarm the worker and
      timers.
      
      This bug is easily reproducible on the first transmit when slip is
      used with the standard 8250 serial driver.
      
      [<c0410b7c>] (spin_bug+0x0/0x38) from [<c006109c>] (do_raw_spin_lock+0x60/0x1d0)
       r5:eab27000 r4:ec02754c
      [<c006103c>] (do_raw_spin_lock+0x0/0x1d0) from [<c04185c0>] (_raw_spin_lock+0x28/0x2c)
       r10:0000001f r9:eabb814c r8:eabb8140 r7:40070193 r6:ec02754c r5:eab27000
       r4:ec02754c r3:00000000
      [<c0418598>] (_raw_spin_lock+0x0/0x2c) from [<bf3a0220>] (slip_write_wakeup+0x50/0xe0 [slip])
       r4:ec027540 r3:00000003
      [<bf3a01d0>] (slip_write_wakeup+0x0/0xe0 [slip]) from [<c026e420>] (tty_wakeup+0x48/0x68)
       r6:00000000 r5:ea80c480 r4:eab27000 r3:bf3a01d0
      [<c026e3d8>] (tty_wakeup+0x0/0x68) from [<c028a8ec>] (uart_write_wakeup+0x2c/0x30)
       r5:ed68ea90 r4:c06790d8
      [<c028a8c0>] (uart_write_wakeup+0x0/0x30) from [<c028dc44>] (serial8250_tx_chars+0x114/0x170)
      [<c028db30>] (serial8250_tx_chars+0x0/0x170) from [<c028dffc>] (serial8250_handle_irq+0xa0/0xbc)
       r6:000000c2 r5:00000060 r4:c06790d8 r3:00000000
      [<c028df5c>] (serial8250_handle_irq+0x0/0xbc) from [<c02933a4>] (dw8250_handle_irq+0x38/0x64)
       r7:00000000 r6:edd2f390 r5:000000c2 r4:c06790d8
      [<c029336c>] (dw8250_handle_irq+0x0/0x64) from [<c028d2f4>] (serial8250_interrupt+0x44/0xc4)
       r6:00000000 r5:00000000 r4:c06791c4 r3:c029336c
      [<c028d2b0>] (serial8250_interrupt+0x0/0xc4) from [<c0067fe4>] (handle_irq_event_percpu+0xb4/0x2b0)
       r10:c06790d8 r9:eab27000 r8:00000000 r7:00000000 r6:0000001f r5:edd52980
       r4:ec53b6c0 r3:c028d2b0
      [<c0067f30>] (handle_irq_event_percpu+0x0/0x2b0) from [<c006822c>] (handle_irq_event+0x4c/0x6c)
       r10:c06790d8 r9:eab27000 r8:c0673ae0 r7:c05c2020 r6:ec53b6c0 r5:edd529d4
       r4:edd52980
      [<c00681e0>] (handle_irq_event+0x0/0x6c) from [<c006b140>] (handle_level_irq+0xe8/0x100)
       r6:00000000 r5:edd529d4 r4:edd52980 r3:00022000
      [<c006b058>] (handle_level_irq+0x0/0x100) from [<c00676f8>] (generic_handle_irq+0x30/0x40)
       r5:0000001f r4:0000001f
      [<c00676c8>] (generic_handle_irq+0x0/0x40) from [<c000f57c>] (handle_IRQ+0xd0/0x13c)
       r4:ea997b18 r3:000000e0
      [<c000f4ac>] (handle_IRQ+0x0/0x13c) from [<c00086c4>] (armada_370_xp_handle_irq+0x4c/0x118)
       r8:000003ff r7:ea997b18 r6:ffffffff r5:60070013 r4:c0674dc0
      [<c0008678>] (armada_370_xp_handle_irq+0x0/0x118) from [<c0013840>] (__irq_svc+0x40/0x70)
      Exception stack(0xea997b18 to 0xea997b60)
      7b00:                                                       00000001 20070013
      7b20: 00000000 0000000b 20070013 eab27000 20070013 00000000 ed10103e eab27000
      7b40: c06790d8 ea997b74 ea997b60 ea997b60 c04186c0 c04186c8 60070013 ffffffff
       r9:eab27000 r8:ed10103e r7:ea997b4c r6:ffffffff r5:60070013 r4:c04186c8
      [<c04186a4>] (_raw_spin_unlock_irqrestore+0x0/0x54) from [<c0288fc0>] (uart_start+0x40/0x44)
       r4:c06790d8 r3:c028ddd8
      [<c0288f80>] (uart_start+0x0/0x44) from [<c028982c>] (uart_write+0xe4/0xf4)
       r6:0000003e r5:00000000 r4:ed68ea90 r3:0000003e
      [<c0289748>] (uart_write+0x0/0xf4) from [<bf3a0d20>] (sl_xmit+0x1c4/0x228 [slip])
       r10:ed388e60 r9:0000003c r8:ffffffdd r7:0000003e r6:ec02754c r5:ea717eb8
       r4:ec027000
      [<bf3a0b5c>] (sl_xmit+0x0/0x228 [slip]) from [<c0368d74>] (dev_hard_start_xmit+0x39c/0x6d0)
       r8:eaf163c0 r7:ec027000 r6:ea717eb8 r5:00000000 r4:00000000
      Signed-off-by: default avatarTyler Hall <tylerwhall@gmail.com>
      Cc: Oliver Hartkopp <socketcan@hartkopp.net>
      Cc: Andre Naujoks <nautsch2@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      661f7fda
    • Neil Horman's avatar
      vmxnet3: adjust ring sizes when interface is down · f00e2b0a
      Neil Horman authored
      If ethtool is used to update ring sizes on a vmxnet3 interface that isn't
      running, the change isn't stored, meaning the ring update is effectively is
      ignored and lost without any indication to the user.
      
      Other network drivers store the ring size update so that ring allocation uses
      the new sizes next time the interface is brought up.  This patch modifies
      vmxnet3 to behave this way as well
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: Shreyas Bhatewara <sbhatewara@vmware.com>
      CC: "VMware, Inc." <pv-drivers@vmware.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f00e2b0a
  5. 16 Jun, 2014 16 commits
  6. 15 Jun, 2014 4 commits
    • Daniel Borkmann's avatar
      net: sctp: fix permissions for rto_alpha and rto_beta knobs · b58537a1
      Daniel Borkmann authored
      Commit 3fd091e7 ("[SCTP]: Remove multiple levels of msecs
      to jiffies conversions.") has silently changed permissions for
      rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of
      this was to discourage users from tweaking rto_alpha and
      rto_beta knobs in production environments since they are key
      to correctly compute rtt/srtt.
      
      RFC4960 under section 6.3.1. RTO Calculation says regarding
      rto_alpha and rto_beta under rule C3 and C4:
      
        [...]
        C3)  When a new RTT measurement R' is made, set
      
             RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'|
      
             and
      
             SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R'
      
             Note: The value of SRTT used in the update to RTTVAR
             is its value before updating SRTT itself using the
             second assignment. After the computation, update
             RTO <- SRTT + 4 * RTTVAR.
      
        C4)  When data is in flight and when allowed by rule C5
             below, a new RTT measurement MUST be made each round
             trip. Furthermore, new RTT measurements SHOULD be
             made no more than once per round trip for a given
             destination transport address. There are two reasons
             for this recommendation: First, it appears that
             measuring more frequently often does not in practice
             yield any significant benefit [ALLMAN99]; second,
             if measurements are made more often, then the values
             of RTO.Alpha and RTO.Beta in rule C3 above should be
             adjusted so that SRTT and RTTVAR still adjust to
             changes at roughly the same rate (in terms of how many
             round trips it takes them to reflect new values) as
             they would if making only one measurement per
             round-trip and using RTO.Alpha and RTO.Beta as given
             in rule C3. However, the exact nature of these
             adjustments remains a research issue.
        [...]
      
      While it is discouraged to adjust rto_alpha and rto_beta
      and not further specified how to adjust them, the RFC also
      doesn't explicitly forbid it, but rather gives a RECOMMENDED
      default value (rto_alpha=3, rto_beta=2). We have a couple
      of users relying on the old permissions before they got
      changed. That said, if someone really has the urge to adjust
      them, we could allow it with a warning in the log.
      
      Fixes: 3fd091e7 ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b58537a1
    • David S. Miller's avatar
      Merge branch 'csum_fixes' · e4f7ae93
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      Fixes related to some recent checksum modifications.
      
      - Fix GSO constants to match NETIF flags
      - Fix logic in saving checksum complete in __skb_checksum_complete
      - Call __skb_checksum_complete from UDP if we are checksumming over
        whole packet in order to save checksum.
      - Fixes to VXLAN to work correctly with checksum complete
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e4f7ae93
    • Tom Herbert's avatar
      vxlan: Checksum fixes · f79b064c
      Tom Herbert authored
      Call skb_pop_rcv_encapsulation and postpull_rcsum for the Ethernet
      header to work properly with checksum complete.
      Signed-off-by: default avatarTom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f79b064c
    • Tom Herbert's avatar
      net: add skb_pop_rcv_encapsulation · e5eb4e30
      Tom Herbert authored
      This function is used by UDP encapsulation protocols in RX when
      crossing encapsulation boundary. If ip_summed is set to
      CHECKSUM_UNNECESSARY and encapsulation is not set, change to
      CHECKSUM_NONE since the checksum has not been validated within the
      encapsulation. Clears csum_valid by the same rationale.
      Signed-off-by: default avatarTom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e5eb4e30