1. 09 Aug, 2016 16 commits
    • Sylwester Nawrocki's avatar
      dm9000: Fix irq trigger type setup on non-dt platforms · a96d3b75
      Sylwester Nawrocki authored
      Commit b5a099c6 "net: ethernet: davicom: fix devicetree irq
      resource" causes an interrupt storm after the ethernet interface
      is activated on S3C24XX platform (ARM non-dt), due to the interrupt
      trigger type not being set properly.
      
      It seems, after adding parsing of IRQ flags in commit 7085a740
      "drivers: platform: parse IRQ flags from resources", there is no path
      for non-dt platforms where irq_set_type callback could be invoked when
      we don't pass the trigger type flags to the request_irq() call.
      
      In case of a board where the regression is seen the interrupt trigger
      type flags are passed through a platform device's resource and it is
      not currently handled properly without passing the irq trigger type
      flags to the request_irq() call.  In case of OF an of_irq_get() call
      within platform_get_irq() function seems to be ensuring required irq_chip
      setup, but there is no equivalent code for non OF/ACPI platforms.
      
      This patch mostly restores irq trigger type setting code which has been
      removed in commit ("net: ethernet: davicom: fix devicetree irq resource").
      
      Fixes: b5a099c6 ("net: ethernet: davicom: fix devicetree irq resource")
      Signed-off-by: default avatarSylwester Nawrocki <s.nawrocki@samsung.com>
      Acked-by: default avatarRobert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a96d3b75
    • Zhu Yanjun's avatar
      bonding: fix the typo · 0d039f33
      Zhu Yanjun authored
      The message "803.ad" should be "802.3ad".
      Signed-off-by: default avatarZhu Yanjun <zyjzyj2000@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0d039f33
    • Grygorii Strashko's avatar
      drivers: net: cpsw: fix kmemleak false-positive reports for sk buffers · 254a49d5
      Grygorii Strashko authored
      Kmemleak reports following false positive memory leaks for each sk
      buffers allocated by CPSW (__netdev_alloc_skb_ip_align()) in
      cpsw_ndo_open() and cpsw_rx_handler():
      
      unreferenced object 0xea915000 (size 2048):
        comm "systemd-network", pid 713, jiffies 4294938323 (age 102.180s)
        hex dump (first 32 bytes):
          00 58 91 ea ff ff ff ff ff ff ff ff ff ff ff ff  .X..............
          ff ff ff ff ff ff fd 0f 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<c0108680>] __kmalloc_track_caller+0x1a4/0x230
          [<c0529eb4>] __alloc_skb+0x68/0x16c
          [<c052c884>] __netdev_alloc_skb+0x40/0x104
          [<bf1ad29c>] cpsw_ndo_open+0x374/0x670 [ti_cpsw]
          [<c053c3d4>] __dev_open+0xb0/0x114
          [<c053c690>] __dev_change_flags+0x9c/0x14c
          [<c053c760>] dev_change_flags+0x20/0x50
          [<c054bdcc>] do_setlink+0x2cc/0x78c
          [<c054c358>] rtnl_setlink+0xcc/0x100
          [<c054b34c>] rtnetlink_rcv_msg+0x184/0x224
          [<c056467c>] netlink_rcv_skb+0xa8/0xc4
          [<c054b1c0>] rtnetlink_rcv+0x2c/0x34
          [<c0564018>] netlink_unicast+0x16c/0x1f8
          [<c0564498>] netlink_sendmsg+0x334/0x348
          [<c052015c>] sock_sendmsg+0x1c/0x2c
          [<c05213e0>] SyS_sendto+0xc0/0xe8
      
      unreferenced object 0xec861780 (size 192):
        comm "softirq", pid 0, jiffies 4294938759 (age 109.540s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          00 00 00 00 00 b0 5a ed 00 00 00 00 00 00 00 00  ......Z.........
        backtrace:
          [<c0107830>] kmem_cache_alloc+0x190/0x208
          [<c052c768>] __build_skb+0x30/0x98
          [<c052c8fc>] __netdev_alloc_skb+0xb8/0x104
          [<bf1abc54>] cpsw_rx_handler+0x68/0x1e4 [ti_cpsw]
          [<bf11aa30>] __cpdma_chan_free+0xa8/0xc4 [davinci_cpdma]
          [<bf11ab98>] __cpdma_chan_process+0x14c/0x16c [davinci_cpdma]
          [<bf11abfc>] cpdma_chan_process+0x44/0x5c [davinci_cpdma]
          [<bf1adc78>] cpsw_rx_poll+0x1c/0x9c [ti_cpsw]
          [<c0539180>] net_rx_action+0x1f0/0x2ec
          [<c003881c>] __do_softirq+0x134/0x258
          [<c0038a00>] do_softirq+0x68/0x70
          [<c0038adc>] __local_bh_enable_ip+0xd4/0xe8
          [<c0640994>] _raw_spin_unlock_bh+0x30/0x34
          [<c05f4e9c>] igmp6_group_added+0x4c/0x1bc
          [<c05f6600>] ipv6_dev_mc_inc+0x398/0x434
          [<c05dba74>] addrconf_dad_work+0x224/0x39c
      
      This happens because CPSW allocates SK buffers and then passes
      pointers on them in CPDMA where they stored in internal CPPI RAM
      (SRAM) which belongs to DEV MMIO space. Kmemleak does not scan IO
      memory and so reports memory leaks.
      
      Hence, mark allocated sk buffers as false positive explicitly.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      254a49d5
    • Lance Richardson's avatar
      vti: flush x-netns xfrm cache when vti interface is removed · a5d0dc81
      Lance Richardson authored
      When executing the script included below, the netns delete operation
      hangs with the following message (repeated at 10 second intervals):
      
        kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
      
      This occurs because a reference to the lo interface in the "secure" netns
      is still held by a dst entry in the xfrm bundle cache in the init netns.
      
      Address this problem by garbage collecting the tunnel netns flow cache
      when a cross-namespace vti interface receives a NETDEV_DOWN notification.
      
      A more detailed description of the problem scenario (referencing commands
      in the script below):
      
      (1) ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1
      
        The vti_test interface is created in the init namespace. vti_tunnel_init()
        attaches a struct ip_tunnel to the vti interface's netdev_priv(dev),
        setting the tunnel net to &init_net.
      
      (2) ip link set vti_test netns secure
      
        The vti_test interface is moved to the "secure" netns. Note that
        the associated struct ip_tunnel still has tunnel->net set to &init_net.
      
      (3) ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1
      
        The first packet sent using the vti device causes xfrm_lookup() to be
        called as follows:
      
            dst = xfrm_lookup(tunnel->net, skb_dst(skb), fl, NULL, 0);
      
        Note that tunnel->net is the init namespace, while skb_dst(skb) references
        the vti_test interface in the "secure" namespace. The returned dst
        references an interface in the init namespace.
      
        Also note that the first parameter to xfrm_lookup() determines which flow
        cache is used to store the computed xfrm bundle, so after xfrm_lookup()
        returns there will be a cached bundle in the init namespace flow cache
        with a dst referencing a device in the "secure" namespace.
      
      (4) ip netns del secure
      
        Kernel begins to delete the "secure" namespace.  At some point the
        vti_test interface is deleted, at which point dst_ifdown() changes
        the dst->dev in the cached xfrm bundle flow from vti_test to lo (still
        in the "secure" namespace however).
        Since nothing has happened to cause the init namespace's flow cache
        to be garbage collected, this dst remains attached to the flow cache,
        so the kernel loops waiting for the last reference to lo to go away.
      
      <Begin script>
      ip link add br1 type bridge
      ip link set dev br1 up
      ip addr add dev br1 1.1.1.1/8
      
      ip netns add secure
      ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1
      ip link set vti_test netns secure
      ip netns exec secure ip link set vti_test up
      ip netns exec secure ip link s lo up
      ip netns exec secure ip addr add dev lo 192.168.100.1/24
      ip netns exec secure ip route add 192.168.200.0/24 dev vti_test
      ip xfrm policy flush
      ip xfrm state flush
      ip xfrm policy add dir out tmpl src 1.1.1.1 dst 1.1.1.2 \
         proto esp mode tunnel mark 1
      ip xfrm policy add dir in tmpl src 1.1.1.2 dst 1.1.1.1 \
         proto esp mode tunnel mark 1
      ip xfrm state add src 1.1.1.1 dst 1.1.1.2 proto esp spi 1 \
         mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788
      ip xfrm state add src 1.1.1.2 dst 1.1.1.1 proto esp spi 1 \
         mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788
      
      ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1
      
      ip netns del secure
      <End script>
      Reported-by: default avatarHangbin Liu <haliu@redhat.com>
      Reported-by: default avatarJan Tluka <jtluka@redhat.com>
      Signed-off-by: default avatarLance Richardson <lrichard@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a5d0dc81
    • David S. Miller's avatar
      Merge tag 'rxrpc-fixes-20160809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · aee4eda1
      David S. Miller authored
      David Howells says:
      
      ====================
      rxrpc: Miscellaneous fixes
      
      Here are a bunch of miscellaneous fixes to AF_RXRPC:
      
       (*) Fix an uninitialised pointer.
      
       (*) Fix error handling when we fail to connect a call.
      
       (*) Fix a NULL pointer dereference.
      
       (*) Fix two occasions where a packet is accessed again after being queued
           for someone else to deal with.
      
       (*) Fix a missing skb free.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aee4eda1
    • David Howells's avatar
      rxrpc: Free packets discarded in data_ready · 992c273a
      David Howells authored
      Under certain conditions, the data_ready handler will discard a packet.
      These need to be freed.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      992c273a
    • David Howells's avatar
      rxrpc: Fix a use-after-push in data_ready handler · 50fd85a1
      David Howells authored
      Fix a use of a packet after it has been enqueued onto the packet processing
      queue in the data_ready handler.  Once on a call's Rx queue, we mustn't
      touch it any more as it may be dequeued and freed by the call processor
      running on a work queue.
      
      Save the values we need before enqueuing.
      
      Without this, we can get an oops like the following:
      
      BUG: unable to handle kernel NULL pointer dereference at 000000000000009c
      IP: [<ffffffffa01854e8>] rxrpc_fast_process_packet+0x724/0xa11 [af_rxrpc]
      PGD 0 
      Oops: 0000 [#1] SMP
      Modules linked in: kafs(E) af_rxrpc(E) [last unloaded: af_rxrpc]
      CPU: 2 PID: 0 Comm: swapper/2 Tainted: G            E   4.7.0-fsdevel+ #1336
      Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
      task: ffff88040d6863c0 task.stack: ffff88040d68c000
      RIP: 0010:[<ffffffffa01854e8>]  [<ffffffffa01854e8>] rxrpc_fast_process_packet+0x724/0xa11 [af_rxrpc]
      RSP: 0018:ffff88041fb03a78  EFLAGS: 00010246
      RAX: ffffffffffffffff RBX: ffff8803ff195b00 RCX: 0000000000000001
      RDX: ffffffffa01854d1 RSI: 0000000000000008 RDI: ffff8803ff195b00
      RBP: ffff88041fb03ab0 R08: 0000000000000000 R09: 0000000000000001
      R10: ffff88041fb038c8 R11: 0000000000000000 R12: ffff880406874800
      R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff88041fb00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000000000009c CR3: 0000000001c14000 CR4: 00000000001406e0
      Stack:
       ffff8803ff195ea0 ffff880408348800 ffff880406874800 ffff8803ff195b00
       ffff880408348800 ffff8803ff195ed8 0000000000000000 ffff88041fb03af0
       ffffffffa0186072 0000000000000000 ffff8804054da000 0000000000000000
      Call Trace:
       <IRQ> 
       [<ffffffffa0186072>] rxrpc_data_ready+0x89d/0xbae [af_rxrpc]
       [<ffffffff814c94d7>] __sock_queue_rcv_skb+0x24c/0x2b2
       [<ffffffff8155c59a>] __udp_queue_rcv_skb+0x4b/0x1bd
       [<ffffffff8155e048>] udp_queue_rcv_skb+0x281/0x4db
       [<ffffffff8155ea8f>] __udp4_lib_rcv+0x7ed/0x963
       [<ffffffff8155ef9a>] udp_rcv+0x15/0x17
       [<ffffffff81531d86>] ip_local_deliver_finish+0x1c3/0x318
       [<ffffffff81532544>] ip_local_deliver+0xbb/0xc4
       [<ffffffff81531bc3>] ? inet_del_offload+0x40/0x40
       [<ffffffff815322a9>] ip_rcv_finish+0x3ce/0x42c
       [<ffffffff81532851>] ip_rcv+0x304/0x33d
       [<ffffffff81531edb>] ? ip_local_deliver_finish+0x318/0x318
       [<ffffffff814dff9d>] __netif_receive_skb_core+0x601/0x6e8
       [<ffffffff814e072e>] __netif_receive_skb+0x13/0x54
       [<ffffffff814e082a>] netif_receive_skb_internal+0xbb/0x17c
       [<ffffffff814e1838>] napi_gro_receive+0xf9/0x1bd
       [<ffffffff8144eb9f>] rtl8169_poll+0x32b/0x4a8
       [<ffffffff814e1c7b>] net_rx_action+0xe8/0x357
       [<ffffffff81051074>] __do_softirq+0x1aa/0x414
       [<ffffffff810514ab>] irq_exit+0x3d/0xb0
       [<ffffffff810184a2>] do_IRQ+0xe4/0xfc
       [<ffffffff81612053>] common_interrupt+0x93/0x93
       <EOI> 
       [<ffffffff814af837>] ? cpuidle_enter_state+0x1ad/0x2be
       [<ffffffff814af832>] ? cpuidle_enter_state+0x1a8/0x2be
       [<ffffffff814af96a>] cpuidle_enter+0x12/0x14
       [<ffffffff8108956f>] call_cpuidle+0x39/0x3b
       [<ffffffff81089855>] cpu_startup_entry+0x230/0x35d
       [<ffffffff810312ea>] start_secondary+0xf4/0xf7
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      50fd85a1
    • David Howells's avatar
      rxrpc: Once packet posted in data_ready, don't retry posting · 2e7e9758
      David Howells authored
      Once a packet has been posted to a connection in the data_ready handler, we
      mustn't try reposting if we then find that the connection is dying as the
      refcount has been given over to the dying connection and the packet might
      no longer exist.
      
      Losing the packet isn't a problem as the peer will retransmit.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      2e7e9758
    • David Howells's avatar
      rxrpc: Don't access connection from call if pointer is NULL · f9dc5757
      David Howells authored
      The call state machine processor sets up the message parameters for a UDP
      message that it might need to transmit in advance on the basis that there's
      a very good chance it's going to have to transmit either an ACK or an
      ABORT.  This requires it to look in the connection struct to retrieve some
      of the parameters.
      
      However, if the call is complete, the call connection pointer may be NULL
      to dissuade the processor from transmitting a message.  However, there are
      some situations where the processor is still going to be called - and it's
      still going to set up message parameters whether it needs them or not.
      
      This results in a NULL pointer dereference at:
      
      	net/rxrpc/call_event.c:837
      
      To fix this, skip the message pre-initialisation if there's no connection
      attached.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      f9dc5757
    • David Howells's avatar
      rxrpc: Need to flag call as being released on connect failure · 17b963e3
      David Howells authored
      If rxrpc_new_client_call() fails to make a connection, the call record that
      it allocated needs to be marked as RXRPC_CALL_RELEASED before it is passed
      to rxrpc_put_call() to indicate that it no longer has any attachment to the
      AF_RXRPC socket.
      
      Without this, an assertion failure may occur at:
      
      	net/rxrpc/call_object:635
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      17b963e3
    • Arnd Bergmann's avatar
      rxrpc: fix uninitialized pointer dereference in debug code · 55cae7a4
      Arnd Bergmann authored
      A newly added bugfix caused an uninitialized variable to be
      used for printing debug output. This is harmless as long
      as the debug setting is disabled, but otherwise leads to an
      immediate crash.
      
      gcc warns about this when -Wmaybe-uninitialized is enabled:
      
      net/rxrpc/call_object.c: In function 'rxrpc_release_call':
      net/rxrpc/call_object.c:496:163: error: 'sp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      The initialization was removed but one of the users remains.
      This adds back the initialization.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Fixes: 372ee163 ("rxrpc: Fix races between skb free, ACK generation and replying")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      55cae7a4
    • David S. Miller's avatar
      Merge branch 'qed-fixes' · 8971d8e5
      David S. Miller authored
      Sudarsana Reddy Kalluru says:
      
      ====================
      qed: dcbx fix series.
      
      The patch series contains the minor bug fixes for qed dcbx module.
      Please consider applying this to 'net' branch.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8971d8e5
    • Sudarsana Reddy Kalluru's avatar
      qed: Update app count when adding a new dcbx app entry to the table. · 1d7406ce
      Sudarsana Reddy Kalluru authored
      App count is not updated while adding new app entry to the dcbx app table.
      Signed-off-by: default avatarSudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
      Signed-off-by: default avatarYuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1d7406ce
    • Sudarsana Reddy Kalluru's avatar
      qed: Add dcbx app support for IEEE Selection Field. · 59bcb797
      Sudarsana Reddy Kalluru authored
      MFW now supports the Selection field for IEEE mode. Add driver changes to
      use the newer MFW masks to read/write the port-id value.
      Signed-off-by: default avatarSudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
      Signed-off-by: default avatarYuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      59bcb797
    • Sudarsana Reddy Kalluru's avatar
      qed: Use ieee mfw-mask to get ethtype in ieee-dcbx mode. · fb9ea8a9
      Sudarsana Reddy Kalluru authored
      Ethtype value is being read incorrectly in ieee-dcbx mode. Use the
      correct mfw mask value.
      Signed-off-by: default avatarSudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
      Signed-off-by: default avatarYuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fb9ea8a9
    • Sudarsana Reddy Kalluru's avatar
      qed: Remove the endian-ness conversion for pri_to_tc value. · c0c45a6b
      Sudarsana Reddy Kalluru authored
      Endian-ness conversion is not needed for priority-to-TC field as the
      field is already being read/written by the driver in big-endian way.
      Signed-off-by: default avatarSudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
      Signed-off-by: default avatarYuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c0c45a6b
  2. 08 Aug, 2016 17 commits
  3. 07 Aug, 2016 4 commits
  4. 06 Aug, 2016 3 commits
    • Geert Uytterhoeven's avatar
      net: dsa: b53: Add missing ULL suffix for 64-bit constant · 5e3b724e
      Geert Uytterhoeven authored
      On 32-bit (e.g. with m68k-linux-gnu-gcc-4.1):
      
          drivers/net/dsa/b53/b53_common.c: In function ‘b53_arl_read’:
          drivers/net/dsa/b53/b53_common.c:1072: warning: integer constant is too large for ‘long’ type
      
      Fixes: 1da6df85 ("net: dsa: b53: Implement ARL add/del/dump operations")
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5e3b724e
    • David Forster's avatar
      ipv4: panic in leaf_walk_rcu due to stale node pointer · 94d9f1c5
      David Forster authored
      Panic occurs when issuing "cat /proc/net/route" whilst
      populating FIB with > 1M routes.
      
      Use of cached node pointer in fib_route_get_idx is unsafe.
      
       BUG: unable to handle kernel paging request at ffffc90001630024
       IP: [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0
       PGD 11b08d067 PUD 11b08e067 PMD dac4b067 PTE 0
       Oops: 0000 [#1] SMP
       Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscac
       snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep virti
       acpi_cpufreq button parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd
      tio_ring virtio floppy uhci_hcd ehci_hcd usbcore usb_common libata scsi_mod
       CPU: 1 PID: 785 Comm: cat Not tainted 4.2.0-rc8+ #4
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
       task: ffff8800da1c0bc0 ti: ffff88011a05c000 task.ti: ffff88011a05c000
       RIP: 0010:[<ffffffff814cf6a0>]  [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0
       RSP: 0018:ffff88011a05fda0  EFLAGS: 00010202
       RAX: ffff8800d8a40c00 RBX: ffff8800da4af940 RCX: ffff88011a05ff20
       RDX: ffffc90001630020 RSI: 0000000001013531 RDI: ffff8800da4af950
       RBP: 0000000000000000 R08: ffff8800da1f9a00 R09: 0000000000000000
       R10: ffff8800db45b7e4 R11: 0000000000000246 R12: ffff8800da4af950
       R13: ffff8800d97a74c0 R14: 0000000000000000 R15: ffff8800d97a7480
       FS:  00007fd3970e0700(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: ffffc90001630024 CR3: 000000011a7e4000 CR4: 00000000000006e0
       Stack:
        ffffffff814d00d3 0000000000000000 ffff88011a05ff20 ffff8800da1f9a00
        ffffffff811dd8b9 0000000000000800 0000000000020000 00007fd396f35000
        ffffffff811f8714 0000000000003431 ffffffff8138dce0 0000000000000f80
       Call Trace:
        [<ffffffff814d00d3>] ? fib_route_seq_start+0x93/0xc0
        [<ffffffff811dd8b9>] ? seq_read+0x149/0x380
        [<ffffffff811f8714>] ? fsnotify+0x3b4/0x500
        [<ffffffff8138dce0>] ? process_echoes+0x70/0x70
        [<ffffffff8121cfa7>] ? proc_reg_read+0x47/0x70
        [<ffffffff811bb823>] ? __vfs_read+0x23/0xd0
        [<ffffffff811bbd42>] ? rw_verify_area+0x52/0xf0
        [<ffffffff811bbe61>] ? vfs_read+0x81/0x120
        [<ffffffff811bcbc2>] ? SyS_read+0x42/0xa0
        [<ffffffff81549ab2>] ? entry_SYSCALL_64_fastpath+0x16/0x75
       Code: 48 85 c0 75 d8 f3 c3 31 c0 c3 f3 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00
      a 04 89 f0 33 02 44 89 c9 48 d3 e8 0f b6 4a 05 49 89
       RIP  [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0
        RSP <ffff88011a05fda0>
       CR2: ffffc90001630024
      Signed-off-by: default avatarDave Forster <dforster@brocade.com>
      Acked-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      94d9f1c5
    • David Howells's avatar
      rxrpc: Fix races between skb free, ACK generation and replying · 372ee163
      David Howells authored
      Inside the kafs filesystem it is possible to occasionally have a call
      processed and terminated before we've had a chance to check whether we need
      to clean up the rx queue for that call because afs_send_simple_reply() ends
      the call when it is done, but this is done in a workqueue item that might
      happen to run to completion before afs_deliver_to_call() completes.
      
      Further, it is possible for rxrpc_kernel_send_data() to be called to send a
      reply before the last request-phase data skb is released.  The rxrpc skb
      destructor is where the ACK processing is done and the call state is
      advanced upon release of the last skb.  ACK generation is also deferred to
      a work item because it's possible that the skb destructor is not called in
      a context where kernel_sendmsg() can be invoked.
      
      To this end, the following changes are made:
      
       (1) kernel_rxrpc_data_consumed() is added.  This should be called whenever
           an skb is emptied so as to crank the ACK and call states.  This does
           not release the skb, however.  kernel_rxrpc_free_skb() must now be
           called to achieve that.  These together replace
           rxrpc_kernel_data_delivered().
      
       (2) kernel_rxrpc_data_consumed() is wrapped by afs_data_consumed().
      
           This makes afs_deliver_to_call() easier to work as the skb can simply
           be discarded unconditionally here without trying to work out what the
           return value of the ->deliver() function means.
      
           The ->deliver() functions can, via afs_data_complete(),
           afs_transfer_reply() and afs_extract_data() mark that an skb has been
           consumed (thereby cranking the state) without the need to
           conditionally free the skb to make sure the state is correct on an
           incoming call for when the call processor tries to send the reply.
      
       (3) rxrpc_recvmsg() now has to call kernel_rxrpc_data_consumed() when it
           has finished with a packet and MSG_PEEK isn't set.
      
       (4) rxrpc_packet_destructor() no longer calls rxrpc_hard_ACK_data().
      
           Because of this, we no longer need to clear the destructor and put the
           call before we free the skb in cases where we don't want the ACK/call
           state to be cranked.
      
       (5) The ->deliver() call-type callbacks are made to return -EAGAIN rather
           than 0 if they expect more data (afs_extract_data() returns -EAGAIN to
           the delivery function already), and the caller is now responsible for
           producing an abort if that was the last packet.
      
       (6) There are many bits of unmarshalling code where:
      
       		ret = afs_extract_data(call, skb, last, ...);
      		switch (ret) {
      		case 0:		break;
      		case -EAGAIN:	return 0;
      		default:	return ret;
      		}
      
           is to be found.  As -EAGAIN can now be passed back to the caller, we
           now just return if ret < 0:
      
       		ret = afs_extract_data(call, skb, last, ...);
      		if (ret < 0)
      			return ret;
      
       (7) Checks for trailing data and empty final data packets has been
           consolidated as afs_data_complete().  So:
      
      		if (skb->len > 0)
      			return -EBADMSG;
      		if (!last)
      			return 0;
      
           becomes:
      
      		ret = afs_data_complete(call, skb, last);
      		if (ret < 0)
      			return ret;
      
       (8) afs_transfer_reply() now checks the amount of data it has against the
           amount of data desired and the amount of data in the skb and returns
           an error to induce an abort if we don't get exactly what we want.
      
      Without these changes, the following oops can occasionally be observed,
      particularly if some printks are inserted into the delivery path:
      
      general protection fault: 0000 [#1] SMP
      Modules linked in: kafs(E) af_rxrpc(E) [last unloaded: af_rxrpc]
      CPU: 0 PID: 1305 Comm: kworker/u8:3 Tainted: G            E   4.7.0-fsdevel+ #1303
      Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
      Workqueue: kafsd afs_async_workfn [kafs]
      task: ffff88040be041c0 ti: ffff88040c070000 task.ti: ffff88040c070000
      RIP: 0010:[<ffffffff8108fd3c>]  [<ffffffff8108fd3c>] __lock_acquire+0xcf/0x15a1
      RSP: 0018:ffff88040c073bc0  EFLAGS: 00010002
      RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: ffff88040d29a710
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88040d29a710
      RBP: ffff88040c073c70 R08: 0000000000000001 R09: 0000000000000001
      R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000000 R14: ffff88040be041c0 R15: ffffffff814c928f
      FS:  0000000000000000(0000) GS:ffff88041fa00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fa4595f4750 CR3: 0000000001c14000 CR4: 00000000001406f0
      Stack:
       0000000000000006 000000000be04930 0000000000000000 ffff880400000000
       ffff880400000000 ffffffff8108f847 ffff88040be041c0 ffffffff81050446
       ffff8803fc08a920 ffff8803fc08a958 ffff88040be041c0 ffff88040c073c38
      Call Trace:
       [<ffffffff8108f847>] ? mark_held_locks+0x5e/0x74
       [<ffffffff81050446>] ? __local_bh_enable_ip+0x9b/0xa1
       [<ffffffff8108f9ca>] ? trace_hardirqs_on_caller+0x16d/0x189
       [<ffffffff810915f4>] lock_acquire+0x122/0x1b6
       [<ffffffff810915f4>] ? lock_acquire+0x122/0x1b6
       [<ffffffff814c928f>] ? skb_dequeue+0x18/0x61
       [<ffffffff81609dbf>] _raw_spin_lock_irqsave+0x35/0x49
       [<ffffffff814c928f>] ? skb_dequeue+0x18/0x61
       [<ffffffff814c928f>] skb_dequeue+0x18/0x61
       [<ffffffffa009aa92>] afs_deliver_to_call+0x344/0x39d [kafs]
       [<ffffffffa009ab37>] afs_process_async_call+0x4c/0xd5 [kafs]
       [<ffffffffa0099e9c>] afs_async_workfn+0xe/0x10 [kafs]
       [<ffffffff81063a3a>] process_one_work+0x29d/0x57c
       [<ffffffff81064ac2>] worker_thread+0x24a/0x385
       [<ffffffff81064878>] ? rescuer_thread+0x2d0/0x2d0
       [<ffffffff810696f5>] kthread+0xf3/0xfb
       [<ffffffff8160a6ff>] ret_from_fork+0x1f/0x40
       [<ffffffff81069602>] ? kthread_create_on_node+0x1cf/0x1cf
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      372ee163