1. 21 Jan, 2011 2 commits
    • bonding: Ensure that we unshare skbs prior to calling pskb_may_pull · b3053251
      Neil Horman authored
      Recently reported oops:
      
      kernel BUG at net/core/skbuff.c:813!
      invalid opcode: 0000 [#1] SMP
      last sysfs file: /sys/devices/virtual/net/bond0/broadcast
      CPU 8
      Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
      ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
      i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
      ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
      scsi_transport_sas dm_mod [last unloaded: microcode]
      
      Pid: 0, comm: swapper Not tainted 2.6.32-71.el6.x86_64 #1 BladeCenter HS22
      -[7870AC1]-
      RIP: 0010:[<ffffffff81405b16>]  [<ffffffff81405b16>]
      pskb_expand_head+0x36/0x1e0
      RSP: 0018:ffff880028303b70  EFLAGS: 00010202
      RAX: 0000000000000002 RBX: ffff880c6458ec80 RCX: 0000000000000020
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880c6458ec80
      RBP: ffff880028303bc0 R08: ffffffff818a6180 R09: ffff880c6458ed64
      R10: ffff880c622b36c0 R11: 0000000000000400 R12: 0000000000000000
      R13: 0000000000000180 R14: ffff880c622b3000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 00000038653452a4 CR3: 0000000001001000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process swapper (pid: 0, threadinfo ffff8806649c2000, task ffff880c64f16ab0)
      Stack:
       ffff880028303bc0 ffffffff8104fff9 000000000000001c 0000000100000000
      <0> ffff880000047d80 ffff880c6458ec80 000000000000001c ffff880c6223da00
      <0> ffff880c622b3000 0000000000000000 ffff880028303c10 ffffffff81407f7a
      Call Trace:
      <IRQ>
       [<ffffffff8104fff9>] ? __wake_up_common+0x59/0x90
       [<ffffffff81407f7a>] __pskb_pull_tail+0x2aa/0x360
       [<ffffffffa0244530>] bond_arp_rcv+0x2c0/0x2e0 [bonding]
       [<ffffffff814a0857>] ? packet_rcv+0x377/0x440
       [<ffffffff8140f21b>] netif_receive_skb+0x2db/0x670
       [<ffffffff8140f788>] napi_skb_finish+0x58/0x70
       [<ffffffff8140fc89>] napi_gro_receive+0x39/0x50
       [<ffffffffa01286eb>] ixgbe_clean_rx_irq+0x35b/0x900 [ixgbe]
       [<ffffffffa01290f6>] ixgbe_clean_rxtx_many+0x136/0x240 [ixgbe]
       [<ffffffff8140fe53>] net_rx_action+0x103/0x210
       [<ffffffff81073bd7>] __do_softirq+0xb7/0x1e0
       [<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170
       [<ffffffff810142cc>] call_softirq+0x1c/0x30
       [<ffffffff81015f35>] do_softirq+0x65/0xa0
       [<ffffffff810739d5>] irq_exit+0x85/0x90
       [<ffffffff814cf915>] do_IRQ+0x75/0xf0
       [<ffffffff81013ad3>] ret_from_intr+0x0/0x11
       <EOI>
       [<ffffffff8101bc01>] ? mwait_idle+0x71/0xd0
       [<ffffffff814cd80a>] ? atomic_notifier_call_chain+0x1a/0x20
       [<ffffffff81011e96>] cpu_idle+0xb6/0x110
       [<ffffffff814c17c8>] start_secondary+0x1fc/0x23f
      
      This resulted from the bonding driver registering packet handlers via
      dev_add_pack and then calling pskb_may_pull. If another packet handler (such
      as one for AF_PACKET sockets) is called first, the delivered skb has a user
      count > 1, which makes pskb_may_pull hit a BUG when it does its skb_shared
      check. Fix this by calling skb_share_check before the may_pull call sites in
      the bonding driver, so that the skb is cloned when needed. Tested
      successfully by myself and the reporter.
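
The unshare-then-pull pattern described above can be sketched in a small userspace model. This is only an illustration under stated assumptions, not the bonding driver's code: struct toy_skb, toy_share_check, toy_may_pull, and toy_handler are hypothetical stand-ins for sk_buff, skb_share_check, pskb_may_pull, and a packet handler. The assert models the BUG on a shared buffer, and the model simplifies by performing that check on every pull.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff: a reference count and a payload. */
struct toy_skb {
    int users;               /* number of handlers holding a reference */
    size_t len;              /* bytes of valid data */
    unsigned char data[64];
};

/* Models the idea behind skb_share_check(): if the buffer is shared, drop
 * our reference to it and return a private copy (NULL if the copy fails). */
static struct toy_skb *toy_share_check(struct toy_skb *skb)
{
    struct toy_skb *copy;

    if (skb->users <= 1)
        return skb;          /* already exclusive, nothing to do */

    copy = malloc(sizeof(*copy));
    if (copy) {
        memcpy(copy, skb, sizeof(*copy));
        copy->users = 1;
    }
    skb->users--;            /* our reference to the shared buffer is gone */
    return copy;
}

/* Models pskb_may_pull(): must never run on a shared buffer. */
static int toy_may_pull(struct toy_skb *skb, size_t need)
{
    assert(skb->users == 1); /* stands in for the BUG in the oops above */
    return skb->len >= need;
}

/* Handler in the shape the fix describes: unshare first, then pull.
 * Returns 1 if hdr_len bytes are available, 0 if the packet is dropped.
 * Toy ownership rule: an unshared buffer stays owned by the caller. */
static int toy_handler(struct toy_skb *skb, size_t hdr_len)
{
    struct toy_skb *mine = toy_share_check(skb);
    int ok;

    if (!mine)
        return 0;
    ok = toy_may_pull(mine, hdr_len);
    if (mine != skb)
        free(mine);          /* the private copy is ours to release */
    return ok;
}
```

Calling toy_may_pull directly on a buffer with users == 2 trips the assert, which is the situation the oops shows. Going through toy_handler leaves the other holder's buffer untouched.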
      
      Signed-off-by: Neil Horman
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: Jay Vosburgh <fubar@us.ibm.com>
      CC: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: fix reported state of interfaces without link · 6a3c869a
      Dimitris Michailidis authored
      Currently tools like ip and ifconfig report incorrect state for cxgb4
      interfaces that are up but do not have link and do so until first link
      establishment.  This is because the initial netif_carrier_off call is
      before register_netdev and it needs to be after to be fully effective.
      Fix this by moving netif_carrier_off into .ndo_open.
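
Why the call order matters can be illustrated with a toy model. The mechanism here is an assumption made for the sketch and is not stated in the commit message: carrier changes made before registration update an internal flag but never reach the state that tools report. All names (toy_netdev, toy_carrier_off, toy_register, toy_open) are hypothetical stand-ins, not cxgb4 or kernel functions.

```c
#include <assert.h>

enum toy_operstate { TOY_OPER_UNKNOWN, TOY_OPER_DOWN };

/* Hypothetical stand-in for a net device. */
struct toy_netdev {
    int registered;                 /* set by toy_register() */
    int no_carrier;                 /* internal "no link" flag */
    enum toy_operstate operstate;   /* what ip/ifconfig would report */
};

/* Stand-in for netif_carrier_off(). Assumption: the reported state is
 * only updated once the device has been registered. */
static void toy_carrier_off(struct toy_netdev *dev)
{
    if (dev->no_carrier)
        return;                     /* flag already set, nothing changes */
    dev->no_carrier = 1;
    if (!dev->registered)
        return;                     /* too early: nobody is notified */
    dev->operstate = TOY_OPER_DOWN;
}

/* Stand-in for register_netdev(). */
static void toy_register(struct toy_netdev *dev)
{
    dev->registered = 1;
}

/* Stand-in for .ndo_open with the fix applied. */
static void toy_open(struct toy_netdev *dev)
{
    assert(dev->registered);        /* open only happens after registration */
    toy_carrier_off(dev);
}

/* Old order: carrier off before registration. */
static enum toy_operstate toy_probe_old(void)
{
    struct toy_netdev dev = { 0, 0, TOY_OPER_UNKNOWN };

    toy_carrier_off(&dev);
    toy_register(&dev);
    return dev.operstate;
}

/* Fixed order: register first, carrier off from open. */
static enum toy_operstate toy_probe_fixed(void)
{
    struct toy_netdev dev = { 0, 0, TOY_OPER_UNKNOWN };

    toy_register(&dev);
    toy_open(&dev);
    return dev.operstate;
}
```

In this model the old order leaves the reported state unknown, while the fixed order reports the interface as down as soon as it is opened.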
      Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 20 Jan, 2011 1 commit
  3. 19 Jan, 2011 16 commits
  4. 18 Jan, 2011 3 commits
  5. 17 Jan, 2011 3 commits
  6. 16 Jan, 2011 14 commits
  7. 14 Jan, 2011 1 commit
    • GRETH: resolve SMP issues and other problems · 0f73f2c5
      Daniel Hellstrom authored
      Fixes the following:
      1. POLL should not enable the IRQ when work is not completed.
      2. There is no locking between TX descriptor cleaning and XMIT descriptor
         handling.
      3. There is no locking between RX POLL and XMIT modifying the control
         register.
      4. TX cleaning (called from POLL) runs in parallel with XMIT, which
         requires otherwise unnecessary locking.
      5. The IRQ handler looks solely at the RX frame status. This is wrong when
         the IRQ is temporarily disabled (in POLL) and when the IRQ is shared.
      6. The IRQ handler clears the IRQ status, which is unnecessary.
      7. The TX queue was stopped preemptively when fewer than MAX_SKB_FRAGS+1
         descriptors were available after an SKB had been scheduled by XMIT.
         Instead, the TX queue is now stopped only when not enough descriptors
         are available upon entering XMIT.
      
      It was hard to split this patch into smaller pieces, since all the changes
      are tied together.
      
      Note that the RX flag used in the interrupt handler does not signal that an
      interrupt was asserted, but that a frame was received. The same goes for TX.
      Also, if the RX flag is set before the IRQ is enabled, the IRQ is not
      asserted until a new frame is received. So extra care must be taken to
      avoid enabling the IRQ when all descriptors are already used, since that
      would result in a deadlock. See the new POLL implementation, which enables
      the IRQ and then looks at the RX flag to determine whether one or more IRQs
      may have been missed. The TX/RX flags are cleared before handling
      previously enabled descriptors; this ensures that the RX/TX flags are valid
      when determining whether the IRQ should be turned on again.
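
The "enable the IRQ, then re-check the RX flag" step described above can be sketched as a userspace model. This is not the GRETH driver code: struct toy_rx_hw, toy_rx, and toy_rx_poll are hypothetical, and the late_frames parameter exists only to simulate frames that arrive after the descriptors were scanned but before the IRQ is enabled.

```c
#include <assert.h>

/* Hypothetical model of the receive side of the hardware. */
struct toy_rx_hw {
    int rx_flag;         /* status bit: "a frame was received" */
    int rx_irq_enabled;  /* RX interrupt enable bit */
    int pending;         /* frames waiting in RX descriptors */
};

/* Process at most 'budget' pending frames; returns how many were handled. */
static int toy_rx(struct toy_rx_hw *hw, int budget)
{
    int done = 0;

    while (hw->pending > 0 && done < budget) {
        hw->pending--;
        done++;
    }
    return done;
}

/* Poll routine in the shape the message describes. */
static int toy_rx_poll(struct toy_rx_hw *hw, int budget, int late_frames)
{
    int done = 0;

    for (;;) {
        hw->rx_flag = 0;               /* clear before handling descriptors */
        done += toy_rx(hw, budget - done);
        if (done >= budget)
            return done;               /* work not completed: IRQ stays off */

        if (late_frames) {             /* simulated race window */
            hw->pending += late_frames;
            hw->rx_flag = 1;
            late_frames = 0;
        }

        hw->rx_irq_enabled = 1;        /* enable first ... */
        if (!hw->rx_flag)
            return done;               /* ... nothing was missed */

        /* A frame arrived before the IRQ was enabled, so no interrupt
         * will be raised for it: turn the IRQ off again and re-poll. */
        hw->rx_irq_enabled = 0;
    }
}
```

With three pending frames and two late ones, the model handles all five in one call and leaves the IRQ enabled. With more frames than the budget, it returns with the IRQ still disabled.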
      
      Moving TX cleaning from POLL to XMIT in the standard case removes some
      locking trouble. Enabling TX cleaning from POLL only when not enough TX
      descriptors are available is safe because the TX queue is stopped at the
      same time, so XMIT will not be called. The TX queue is woken up again when
      enough descriptors are available.
      
      TX frames are always enabled with IRQ; however, the TX IRQ Enable flag is
      not set until XMIT must wait for free descriptors.
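
The division of TX work between XMIT and POLL described above can be modelled roughly as follows. This is a sketch under assumptions, not the driver's code: the names are hypothetical, "enough descriptors" on entry is taken to mean nr_frags + 1, the wake threshold is taken to be MAX_SKB_FRAGS + 1, the value 17 is arbitrary, and clearing the TX IRQ enable on wake-up is a guess.

```c
#include <assert.h>

#define TOY_MAX_SKB_FRAGS 17        /* arbitrary value for the sketch */

enum { TOY_TX_OK = 0, TOY_TX_BUSY = 1 };

/* Hypothetical model of the TX descriptor ring state. */
struct toy_tx_ring {
    int free_descs;       /* descriptors available to XMIT */
    int completed;        /* sent descriptors not yet reclaimed */
    int queue_stopped;    /* stands in for netif_stop_queue() state */
    int tx_irq_enabled;   /* TX IRQ Enable flag */
};

/* Reclaim descriptors the hardware has finished with. */
static void toy_tx_clean(struct toy_tx_ring *r)
{
    r->free_descs += r->completed;
    r->completed = 0;
}

/* XMIT: cleans the ring itself in the standard case and stops the queue
 * only when the frame does not fit on entry. */
static int toy_xmit(struct toy_tx_ring *r, int nr_frags)
{
    assert(!r->queue_stopped);      /* XMIT is not called on a stopped queue */

    toy_tx_clean(r);
    if (r->free_descs < nr_frags + 1) {
        r->queue_stopped = 1;
        r->tx_irq_enabled = 1;      /* now XMIT must wait for descriptors */
        return TOY_TX_BUSY;
    }
    r->free_descs -= nr_frags + 1;
    return TOY_TX_OK;
}

/* POLL: cleans TX only while the queue is stopped, so it never runs in
 * parallel with XMIT in this model. */
static void toy_tx_poll(struct toy_tx_ring *r)
{
    if (!r->queue_stopped)
        return;

    toy_tx_clean(r);
    if (r->free_descs >= TOY_MAX_SKB_FRAGS + 1) {
        r->queue_stopped = 0;       /* wake the queue */
        r->tx_irq_enabled = 0;      /* assumption: no longer needed */
    }
}
```

Because toy_tx_poll touches the ring only while the queue is stopped, and toy_xmit runs only while it is not, the two paths never clean the ring concurrently in this model.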
      
      Locking RX and XMIT parts of the driver from each other is needed because
      the RX/TX enable bits share the same register.
      Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>