1. 26 Jan, 2017 10 commits
  2. 25 Jan, 2017 14 commits
    • net: dsa: Bring back device detaching in dsa_slave_suspend() · f154be24
      Florian Fainelli authored
      Commit 448b4482 ("net: dsa: Add lockdep class to tx queues to avoid
      lockdep splat") removed the netif_device_detach() call from
      dsa_slave_suspend(). That call is necessary and is paired with a
      corresponding netif_device_attach(), so bring it back.
      
      Fixes: 448b4482 ("net: dsa: Add lockdep class to tx queues to avoid lockdep splat")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f154be24
    • Merge branch 'phy-truncated-led-names' · d5bdc021
      David S. Miller authored
      Geert Uytterhoeven says:
      
      ====================
      net: phy: leds: Fix truncated LED trigger names and crashes
      
      I started seeing crashes during s2ram and poweroff on all my ARM boards,
      like:
      
          Unable to handle kernel NULL pointer dereference at virtual address 00000000
          ...
          [<c04116d4>] (__list_del_entry_valid) from [<c05e8948>] (led_trigger_unregister+0x34/0xcc)
          [<c05e8948>] (led_trigger_unregister) from [<c05336c4>] (phy_led_triggers_unregister+0x28/0x34)
          [<c05336c4>] (phy_led_triggers_unregister) from [<c0531d44>] (phy_detach+0x30/0x74)
          [<c0531d44>] (phy_detach) from [<c0538bdc>] (sh_eth_close+0x64/0x9c)
          [<c0538bdc>] (sh_eth_close) from [<c04d4ce0>] (dpm_run_callback+0x48/0xc8)
      
      or:
      
          list_del corruption. prev->next should be dede6540, but was 2e323931
          ------------[ cut here ]------------
          kernel BUG at lib/list_debug.c:52!
          ...
          [<c02f6d70>] (__list_del_entry_valid) from [<c0425168>] (led_trigger_unregister+0x34/0xcc)
          [<c0425168>] (led_trigger_unregister) from [<c03a05a0>] (phy_led_triggers_unregister+0x28/0x34)
          [<c03a05a0>] (phy_led_triggers_unregister) from [<c039ec04>] (phy_detach+0x30/0x74)
          [<c039ec04>] (phy_detach) from [<c03a4fc0>] (sh_eth_close+0x6c/0xa4)
          [<c03a4fc0>] (sh_eth_close) from [<c0483234>] (__dev_close_many+0xac/0xd0)
      
      As the only clue was a kernel message like
      
          sh-eth ee700000.ethernet eth0: No phy led trigger registered for speed(100)
      
      I had to bisect this, leading to commit 4567d686 ("phy:
      increase size of MII_BUS_ID_SIZE and bus_id").  Reverting that commit
      fixed the issue.
      
      More investigation revealed the crashes are due to the combination of
      two things:
        - Truncated LED trigger names, leading to duplicate names, and
          registration failures,
        - Bad error handling in case of registration failures.
      
      Both are fixed by this patch series.
      
      Changes compared to v1:
        - Add Reviewed-by,
        - New patch "net: phy: leds: Break dependency of phy.h on
          phy_led_triggers.h",
        - Drop moving the include of <linux/phy_led_triggers.h>, as
          <linux/phy.h> no longer includes it,
        - #include <linux/phy.h> from <linux/phy_led_triggers.h>.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d5bdc021
    • net: phy: leds: Fix truncated LED trigger names · 3c880eb0
      Geert Uytterhoeven authored
      Commit 4567d686 ("phy: increase size of MII_BUS_ID_SIZE and
      bus_id") increased the size of MII bus IDs, but forgot to update the
      private definition in <linux/phy_led_triggers.h>.
      This may cause:
        1. Truncation of LED trigger names,
        2. Duplicate LED trigger names,
        3. Failures registering LED triggers,
        4. Crashes due to bad error handling in the LED trigger failure path.
      
      To fix this, and prevent the definitions going out of sync again in the
      future, let the PHY LED trigger code use the existing MII_BUS_ID_SIZE
      definition.
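
A minimal userspace sketch of how a too-small name buffer can make two trigger names collide, using the PHY ID and speed suffixes quoted in this commit message. The buffer sizes and the helper names (make_trigger_name, names_collide) are assumptions made for illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sizes for illustration; the kernel's real macros may differ. */
#define OLD_NAME_SIZE 31   /* buffer still sized for the old, shorter bus ID */
#define NEW_NAME_SIZE 64   /* buffer large enough for the longer bus ID */

/*
 * Build "<phy id>:<speed suffix>" into buf. snprintf() silently truncates
 * when the result does not fit, which is how two names can end up equal.
 */
static void make_trigger_name(char *buf, size_t size,
                              const char *phy_id, const char *suffix)
{
    assert(size > 0);
    snprintf(buf, size, "%s:%s", phy_id, suffix);
}

/* Return 1 if the 100Mbps and 10Mbps names are identical for this size. */
static int names_collide(size_t size)
{
    char a[NEW_NAME_SIZE], b[NEW_NAME_SIZE];

    assert(size <= sizeof(a));
    make_trigger_name(a, size, "ee700000.ethernet-ffffffff:01", "100Mbps");
    make_trigger_name(b, size, "ee700000.ethernet-ffffffff:01", "10Mbps");
    return strcmp(a, b) == 0;
}
```

With the smaller size both names are cut off right after the second colon, so a registry that rejects duplicate names would refuse the second one. With the larger size the speed suffix survives and the names stay distinct.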
      
      Example:
        - Before I had triggers "ee700000.etherne:01:100Mbps" and
          "ee700000.etherne:01:10Mbps",
        - After the increase of MII_BUS_ID_SIZE, both became
          "ee700000.ethernet-ffffffff:01:" => FAIL,
        - Now, the triggers are "ee700000.ethernet-ffffffff:01:100Mbps" and
          "ee700000.ethernet-ffffffff:01:10Mbps", which are unique again.
      
      Fixes: 4567d686 ("phy: increase size of MII_BUS_ID_SIZE and bus_id")
      Fixes: 2e0bc452 ("net: phy: leds: add support for led triggers on phy link state change")
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c880eb0
    • net: phy: leds: Break dependency of phy.h on phy_led_triggers.h · d6f8cfa3
      Geert Uytterhoeven authored
      <linux/phy.h> includes <linux/phy_led_triggers.h>, which is not really
      needed.  Drop the include from <linux/phy.h>, and add it to all users
      that didn't include it explicitly.
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d6f8cfa3
    • net: phy: leds: Clear phy_num_led_triggers on failure to avoid crash · 8a87fca8
      Geert Uytterhoeven authored
      phy_attach_direct() ignores errors returned by
      phy_led_triggers_register(). I think that's OK, as LED triggers can be
      considered a non-critical feature.
      
      However, this causes problems later:
        - phy_led_trigger_change_speed() will access the array
          phy_device.phy_led_triggers, which has been freed in the error path
          of phy_led_triggers_register(), which may lead to a crash.
      
        - phy_led_triggers_unregister() will access the same array, leading to
          crashes during s2ram or poweroff, like:
      
      	Unable to handle kernel NULL pointer dereference at virtual address
      	00000000
      	...
      	[<c04116d4>] (__list_del_entry_valid) from [<c05e8948>] (led_trigger_unregister+0x34/0xcc)
      	[<c05e8948>] (led_trigger_unregister) from [<c05336c4>] (phy_led_triggers_unregister+0x28/0x34)
      	[<c05336c4>] (phy_led_triggers_unregister) from [<c0531d44>] (phy_detach+0x30/0x74)
      	[<c0531d44>] (phy_detach) from [<c0538bdc>] (sh_eth_close+0x64/0x9c)
      	[<c0538bdc>] (sh_eth_close) from [<c04d4ce0>] (dpm_run_callback+0x48/0xc8)
      
          or:
      
      	list_del corruption. prev->next should be dede6540, but was 2e323931
      	------------[ cut here ]------------
      	kernel BUG at lib/list_debug.c:52!
      	...
      	[<c02f6d70>] (__list_del_entry_valid) from [<c0425168>] (led_trigger_unregister+0x34/0xcc)
      	[<c0425168>] (led_trigger_unregister) from [<c03a05a0>] (phy_led_triggers_unregister+0x28/0x34)
      	[<c03a05a0>] (phy_led_triggers_unregister) from [<c039ec04>] (phy_detach+0x30/0x74)
      	[<c039ec04>] (phy_detach) from [<c03a4fc0>] (sh_eth_close+0x6c/0xa4)
      	[<c03a4fc0>] (sh_eth_close) from [<c0483234>] (__dev_close_many+0xac/0xd0)
      
      To fix this, clear phy_device.phy_num_led_triggers in the error path of
      phy_led_triggers_register().
      
      Note that the "No phy led trigger registered for speed" message will
      still be printed on link speed changes, which is a good cue that
      something went wrong with the LED triggers.
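
The idea can be sketched in userspace as follows. This is only a model under stated assumptions: struct fake_phy and the fake_triggers_* helpers are hypothetical stand-ins, not the kernel's types or functions. The point it shows is that once the error path resets the count to zero, any later loop over the array walks nothing instead of touching freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the PHY device; field names are illustrative. */
struct fake_phy {
    int *triggers;              /* one slot per supported speed */
    unsigned int num_triggers;  /* how many slots later loops will walk */
};

/* Register n triggers; fail_at >= 0 simulates a registration failure. */
static int fake_triggers_register(struct fake_phy *phy, unsigned int n,
                                  int fail_at)
{
    assert(n > 0);
    phy->num_triggers = n;
    phy->triggers = calloc(n, sizeof(*phy->triggers));
    if (!phy->triggers)
        goto out_clear;

    for (unsigned int i = 0; i < n; i++) {
        if ((int)i == fail_at)
            goto out_free;
        phy->triggers[i] = 1;   /* mark slot as registered */
    }
    return 0;

out_free:
    free(phy->triggers);
    phy->triggers = NULL;
out_clear:
    phy->num_triggers = 0;      /* key step: later loops see an empty array */
    return -1;
}

/* Walks num_triggers entries; a no-op after a failed register. */
static unsigned int fake_triggers_unregister(struct fake_phy *phy)
{
    unsigned int released = 0;

    for (unsigned int i = 0; i < phy->num_triggers; i++) {
        phy->triggers[i] = 0;
        released++;
    }
    free(phy->triggers);
    phy->triggers = NULL;
    phy->num_triggers = 0;
    return released;
}
```

Without the reset at out_clear, the unregister loop would still iterate over the old count and dereference the freed array.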
      
      Fixes: 2e0bc452 ("net: phy: leds: add support for led triggers on phy link state change")
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8a87fca8
    • net-next: ethernet: mediatek: change the compatible string · 8b901f6b
      John Crispin authored
      When the binding was defined, I was not aware that mt2701 was an earlier
      version of the SoC. For sake of consistency, the ethernet driver should
      use mt2701 inside the compat string as this is the earliest SoC with the
      ethernet core.
      
      The ethernet driver is currently of no real use until we finish and
      upstream the DSA driver. There are no users of this binding yet. It should
      be safe to fix this now before it is too late and we need to provide
      backward compatibility for the mt7623-eth compat string.
      Reported-by: Sean Wang <sean.wang@mediatek.com>
      Signed-off-by: John Crispin <john@phrozen.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8b901f6b
    • Documentation: devicetree: change the mediatek ethernet compatible string · 61976fff
      John Crispin authored
      When the binding was defined, I was not aware that mt2701 was an earlier
      version of the SoC. For sake of consistency, the ethernet driver should
      use mt2701 inside the compat string as this is the earliest SoC with the
      ethernet core.
      
      The ethernet driver is currently of no real use until we finish and
      upstream the DSA driver. There are no users of this binding yet. It should
      be safe to fix this now before it is too late and we need to provide
      backward compatibility for the mt7623-eth compat string.
      Reported-by: Sean Wang <sean.wang@mediatek.com>
      Signed-off-by: John Crispin <john@phrozen.org>
      Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      61976fff
    • Merge branch 'bnxt_en-rtnl-fixes' · c0d9665f
      David S. Miller authored
      Michael Chan says:
      
      ====================
      bnxt_en: Fix RTNL lock usage in bnxt_sp_task().
      
      There are 2 function calls from bnxt_sp_task() that have buggy RTNL
      usage.  These 2 functions take RTNL lock under some conditions, but
      some callers (such as open, ethtool) have already taken RTNL.  These
      3 patches fix the issue by making it clear that callers must take
      RTNL.  If the caller is bnxt_sp_task() which does not automatically
      take RTNL, we add a common scheme for bnxt_sp_task() to call these
      functions properly under RTNL.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c0d9665f
    • bnxt_en: Fix RTNL lock usage on bnxt_get_port_module_status(). · 90c694bb
      Michael Chan authored
      bnxt_get_port_module_status() calls bnxt_update_link() which expects
      RTNL to be held.  In bnxt_sp_task() that does not hold RTNL, we need to
      call it with a prior call to bnxt_rtnl_lock_sp() and the call needs to
      be moved to the end of bnxt_sp_task().
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      90c694bb
    • bnxt_en: Fix RTNL lock usage on bnxt_update_link(). · 0eaa24b9
      Michael Chan authored
      bnxt_update_link() is called from multiple code paths.  Most callers,
      such as open, ethtool, already hold RTNL.  Only the caller bnxt_sp_task()
      does not.  So it is a bug to take RTNL inside bnxt_update_link().
      
      Fix it by removing the RTNL inside bnxt_update_link().  The function
      now expects the caller to always hold RTNL.
      
      In bnxt_sp_task(), call bnxt_rtnl_lock_sp() before calling
      bnxt_update_link().  We also need to move the call to the end of
      bnxt_sp_task() since it will be clearing the BNXT_STATE_IN_SP_TASK bit.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0eaa24b9
    • bnxt_en: Fix bnxt_reset() in the slow path task. · a551ee94
      Michael Chan authored
      In bnxt_sp_task(), we set a bit BNXT_STATE_IN_SP_TASK so that bnxt_close()
      will synchronize and wait for bnxt_sp_task() to finish.  Some functions
      in bnxt_sp_task() require us to clear BNXT_STATE_IN_SP_TASK and then
      acquire rtnl_lock() to prevent race conditions.
      
      There are some bugs related to this logic. This patch refactors the code
      to have common bnxt_rtnl_lock_sp() and bnxt_rtnl_unlock_sp() to handle
      the RTNL and the clearing/setting of the bit.  Multiple functions will
      need the same logic.  We also need to move bnxt_reset() to the end of
      bnxt_sp_task().  Functions that clear BNXT_STATE_IN_SP_TASK must be the
      last functions to be called in bnxt_sp_task().  The common scheme will
      handle the condition properly.
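
A rough userspace model of the paired helpers described above. Everything here is an assumption for illustration: struct fake_bp, the two atomic flags standing in for the BNXT_STATE_IN_SP_TASK bit and the RTNL lock, and the order of operations inside the unlock helper are not taken from the driver source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model; names only loosely mirror the commit text. */
struct fake_bp {
    atomic_bool in_sp_task;  /* stands in for the BNXT_STATE_IN_SP_TASK bit */
    atomic_bool rtnl_held;   /* stands in for the RTNL lock */
};

static void fake_rtnl_lock(struct fake_bp *bp)
{
    bool expected = false;

    /* Spin until we flip rtnl_held from false to true. */
    while (!atomic_compare_exchange_weak(&bp->rtnl_held, &expected, true))
        expected = false;
}

static void fake_rtnl_unlock(struct fake_bp *bp)
{
    assert(atomic_load(&bp->rtnl_held));  /* unlocking a free lock is a bug */
    atomic_store(&bp->rtnl_held, false);
}

/* Clear the "in slow-path task" bit first, then take the lock. */
static void fake_rtnl_lock_sp(struct fake_bp *bp)
{
    atomic_store(&bp->in_sp_task, false);
    fake_rtnl_lock(bp);
}

/* Restore the bit, then drop the lock (ordering assumed for this sketch). */
static void fake_rtnl_unlock_sp(struct fake_bp *bp)
{
    atomic_store(&bp->in_sp_task, true);
    fake_rtnl_unlock(bp);
}
```

Keeping both steps in one pair of helpers means every caller in the slow-path task handles the bit and the lock the same way, which is the refactoring the message describes.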
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a551ee94
    • tcp: correct memory barrier usage in tcp_check_space() · 56d80622
      Jason Baron authored
      sock_reset_flag() maps to __clear_bit() not the atomic version clear_bit().
      Thus, we need smp_mb(), smp_mb__after_atomic() is not sufficient.
      
      Fixes: 3c715127 ("tcp: add memory barriers to write space paths")
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Jason Baron <jbaron@akamai.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      56d80622
    • sctp: sctp gso should set feature with NETIF_F_SG when calling skb_segment · 5207f399
      Xin Long authored
      Now sctp gso puts segments into skb's frag_list, then processes these
      segments in skb_segment. But skb_segment handles them only when sg is
      enabled, as it's in the same branch with skb's frags.
      
      Almost all NICs support sg, apart from some old ones. However, since
      commit 1e16aa3d ("net: gso: use feature flag argument in all protocol
      gso handlers"), features &= skb->dev->hw_enc_features, and
      xfrm_output_gso calls skb_segment with features = 0, which means sctp
      gso would call skb_segment with sg = 0, and skb_segment would not work
      as expected.
      
      This patch is to fix it by setting features param with NETIF_F_SG when
      calling skb_segment so that it can go the right branch to process the
      skb's frag_list.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5207f399
    • sctp: sctp_addr_id2transport should verify the addr before looking up assoc · 6f29a130
      Xin Long authored
      sctp_addr_id2transport is a function for sockopt to look up assoc by
      address. As the address is from userspace, it can be a v4-mapped v6
      address. But in sctp protocol stack, it always handles a v4-mapped
      v6 address as a v4 address. So it's necessary to convert it to a v4
      address before looking up assoc by address.
      
      This patch is to fix it by calling sctp_verify_addr in which it can do
      this conversion before calling sctp_endpoint_lookup_assoc, just like
      what sctp_sendmsg and __sctp_connect do for the address from users.
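
For readers unfamiliar with v4-mapped addresses, here is a small userspace sketch of the conversion step itself. It uses the standard sockets API rather than any SCTP kernel code, and the helper name v4mapped_to_v4 is made up for this example.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * If addr is an IPv4-mapped IPv6 address (::ffff:a.b.c.d), copy the embedded
 * IPv4 address into out and return 1. Otherwise leave out untouched and
 * return 0.
 */
static int v4mapped_to_v4(const struct in6_addr *addr, struct in_addr *out)
{
    assert(addr && out);
    if (!IN6_IS_ADDR_V4MAPPED(addr))
        return 0;
    /* The IPv4 address occupies the last 4 of the 16 bytes. */
    memcpy(&out->s_addr, &addr->s6_addr[12], sizeof(out->s_addr));
    return 1;
}
```

A lookup keyed on the plain v4 form would miss an entry if it were handed the mapped form unchanged, which is why the conversion has to happen before the lookup.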
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6f29a130
  3. 24 Jan, 2017 16 commits
    • Merge branch 'lwt-module-unload' · ec221a17
      David S. Miller authored
      Robert Shearman says:
      
      ====================
      net: Fix oops on state free after lwt module unload
      
      An oops is seen in lwtstate_free after an lwt ops module has been
      unloaded. This patchset fixes this by preventing modules implementing
      lwtunnel ops from being unloaded whilst there's state alive using
      those ops. The first patch fills in a new owner field in all lwt
      ops and the second patch makes use of this to reference count the
      modules as state is built and destroyed using them.
      
      Changes in v3:
       - don't put module reference if try_module_get fails on building state
      
      Changes in v2:
       - specify module owner for all modules as suggested by DaveM
       - reference count all modules building lwt state, not just those ops
         implementing destroy_state, as also suggested by DaveM.
       - rebased on top of David Ahern's lwtunnel changes
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ec221a17
    • lwtunnel: Fix oops on state free after encap module unload · 85c81401
      Robert Shearman authored
      When attempting to free lwtunnel state after the module for the encap
      has been unloaded an oops occurs:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      IP: lwtstate_free+0x18/0x40
      [..]
      task: ffff88003e372380 task.stack: ffffc900001fc000
      RIP: 0010:lwtstate_free+0x18/0x40
      RSP: 0018:ffff88003fd83e88 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff88002bbb3380 RCX: ffff88000c91a300
      [..]
      Call Trace:
       <IRQ>
       free_fib_info_rcu+0x195/0x1a0
       ? rt_fibinfo_free+0x50/0x50
       rcu_process_callbacks+0x2d3/0x850
       ? rcu_process_callbacks+0x296/0x850
       __do_softirq+0xe4/0x4cb
       irq_exit+0xb0/0xc0
       smp_apic_timer_interrupt+0x3d/0x50
       apic_timer_interrupt+0x93/0xa0
      [..]
      Code: e8 6e c6 fc ff 89 d8 5b 5d c3 bb de ff ff ff eb f4 66 90 66 66 66 66 90 55 48 89 e5 53 0f b7 07 48 89 fb 48 8b 04 c5 00 81 d5 81 <48> 8b 40 08 48 85 c0 74 13 ff d0 48 8d 7b 20 be 20 00 00 00 e8
      
      The problem is that after the module for the encap has been unloaded,
      the corresponding ops is removed and is thus NULL here.
      
      Modules implementing lwtunnel ops should not be allowed to unload
      while there is state alive using those ops, so grab the module
      reference for the ops on creating lwtunnel state and of course release
      the reference when freeing the state.
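
The reference-counting scheme can be modelled in a few lines of single-threaded userspace C. All names below (fake_module, fake_ops, fake_state and the helpers) are hypothetical and only mimic the roles of the module, the ops table and the lwtunnel state; the real kernel primitives are atomic and considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_module { int refcnt; bool unloading; };
struct fake_ops    { struct fake_module *owner; };
struct fake_state  { const struct fake_ops *ops; };

/* Take a reference unless the module is already going away. */
static bool fake_module_get(struct fake_module *m)
{
    if (m->unloading)
        return false;
    m->refcnt++;
    return true;
}

static void fake_module_put(struct fake_module *m)
{
    assert(m->refcnt > 0);
    m->refcnt--;
}

/* Unload succeeds only when no state holds a reference. */
static bool fake_module_unload(struct fake_module *m)
{
    if (m->refcnt > 0)
        return false;
    m->unloading = true;
    return true;
}

/* Building state pins the module; on failure there is nothing to put. */
static bool fake_build_state(struct fake_state *st, const struct fake_ops *ops)
{
    if (!fake_module_get(ops->owner))
        return false;
    st->ops = ops;
    return true;
}

/* Freeing state drops the reference taken at build time. */
static void fake_state_free(struct fake_state *st)
{
    fake_module_put(st->ops->owner);
    st->ops = NULL;
}
```

While any state exists the unload attempt is refused, so the ops table is still present when the state is eventually freed.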
      
      Fixes: 1104d9ba ("lwtunnel: Add destroy state operation")
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      85c81401
    • net: Specify the owning module for lwtunnel ops · 88ff7334
      Robert Shearman authored
      Modules implementing lwtunnel ops should not be allowed to unload
      while there is state alive using those ops, so specify the owning
      module for all lwtunnel ops.
      Signed-off-by: Robert Shearman <rshearma@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88ff7334
    • Merge branch 'tipc-topology-fixes' · 04d7f1fb
      David S. Miller authored
      Parthasarathy Bhuvaragan says:
      
      ====================
      tipc: topology server fixes for nametable soft lockup
      
      In this series, we revert the commit 333f7962 ("tipc: fix a
      race condition leading to subscriber refcnt bug") and provide an
      alternate solution to fix the race conditions in commits 2-4.
      
      We have to do this as the above commit introduced a nametbl soft
      lockup at module exit as described by patch#4.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      04d7f1fb
    • tipc: fix cleanup at module unload · 35e22e49
      Parthasarathy Bhuvaragan authored
      In tipc_server_stop(), we iterate over the connections with limiting
      factor as server's idr_in_use. We ignore the fact that this variable
      is decremented in tipc_close_conn(), leading to premature exit.
      
      In this commit, we iterate until we have no connections left.
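
The premature exit is easy to reproduce in a toy model. The sketch below is not the TIPC code: struct fake_server, the fixed-size table and the function names are invented for illustration. It only shows why a loop bound that shrinks inside the loop body stops early.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CONNS 8

struct fake_server {
    bool open[MAX_CONNS];  /* open[id] is true while connection id exists */
    int in_use;            /* number of open connections */
};

/* Closing a connection also decrements the in-use counter. */
static void fake_close_conn(struct fake_server *s, int id)
{
    assert(s->open[id]);
    s->open[id] = false;
    s->in_use--;
}

/* Buggy: the bound shrinks on every close, so the loop ends too soon. */
static void stop_buggy(struct fake_server *s)
{
    for (int id = 0; id < s->in_use; id++)
        if (s->open[id])
            fake_close_conn(s, id);
}

/* Fixed: keep scanning IDs until no connection is left. */
static void stop_fixed(struct fake_server *s)
{
    for (int id = 0; s->in_use && id < MAX_CONNS; id++)
        if (s->open[id])
            fake_close_conn(s, id);
}
```

With four open connections at IDs 0 to 3, the buggy loop closes two and leaves two behind, while the fixed loop closes all four.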
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Tested-by: John Thompson <thompa.atl@gmail.com>
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      35e22e49
    • tipc: ignore requests when the connection state is not CONNECTED · 4c887aa6
      Parthasarathy Bhuvaragan authored
      In tipc_conn_sendmsg(), we first queue the request to the outqueue
      followed by the connection state check. If the connection is not
      connected, we should not queue this message.
      
      In this commit, we reject the messages if the connection state is
      not CF_CONNECTED.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Tested-by: John Thompson <thompa.atl@gmail.com>
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4c887aa6
    • tipc: fix nametbl_lock soft lockup at module exit · 9dc3abdd
      Parthasarathy Bhuvaragan authored
      Commit 333f7962 ("tipc: fix a race condition leading to
      subscriber refcnt bug") reveals a soft lockup while acquiring
      nametbl_lock.
      
      Before commit 333f7962, we call tipc_conn_shutdown() from
      tipc_close_conn() in the context of tipc_topsrv_stop(). In that
      context, we are allowed to grab the nametbl_lock.
      
      Commit 333f7962 moved tipc_conn_release (renamed from
      tipc_conn_shutdown) to the connection refcount cleanup. This allows
      either tipc_nametbl_withdraw() or tipc_topsrv_stop() to perform the
      cleanup.
      
      Since tipc_exit_net() first calls tipc_topsrv_stop() and then
      tipc_nametbl_withdraw(), the chances increase for the latter to
      perform the connection cleanup.
      
      The soft lockup occurs in the call chain of tipc_nametbl_withdraw(),
      when it performs the tipc_conn_kref_release() as it tries to grab
      nametbl_lock again while holding it already.
      tipc_nametbl_withdraw() grabs nametbl_lock
        tipc_nametbl_remove_publ()
          tipc_subscrp_report_overlap()
            tipc_subscrp_send_event()
              tipc_conn_sendmsg()
                << if (con->flags != CF_CONNECTED) we do conn_put(),
                   triggering the cleanup as refcount=0. >>
                tipc_conn_kref_release
                  tipc_sock_release
                    tipc_conn_release
                      tipc_subscrb_delete
                        tipc_subscrp_delete
                          tipc_nametbl_unsubscribe << Soft Lockup >>
      
      The previous changes in this series fix the race conditions fixed
      by commit 333f7962. Hence we can now revert the commit.
      
      Fixes: 333f7962 ("tipc: fix a race condition leading to subscriber refcnt bug")
      Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9dc3abdd
    • tipc: fix connection refcount error · fc0adfc8
      Parthasarathy Bhuvaragan authored
      Until now, the generic server framework maintains the connection
      IDs per subscriber in the server's conn_idr. At tipc_close_conn, we
      remove the connection ID from the server list, but the connection is
      valid until we call the refcount cleanup. Hence we have a window
      where the server allocates the same connection to a new subscriber,
      leading to an inconsistent reference count. We have another refcount
      warning when we grab the refcount in tipc_conn_lookup() for
      connections with the CF_CONNECTED flag not set. This usually occurs at
      shutdown when we stop the topology server and withdraw the
      TIPC_CFG_SRV publication, thereby triggering a withdraw message to
      subscribers.
      
      In this commit, we:
      1. remove the connection from the server list at refcount cleanup.
      2. grab the refcount for a connection only if CF_CONNECTED is set.
      Tested-by: John Thompson <thompa.atl@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fc0adfc8
    • tipc: add subscription refcount to avoid invalid delete · d094c4d5
      Parthasarathy Bhuvaragan authored
      Until now, the subscribers keep track of the subscriptions using
      reference count at subscriber level. At subscription cancel or
      subscriber delete, we delete the subscription only if the timer
      was pending for the subscription. This approach is incorrect as:
      1. del_timer() is not SMP safe: if on CPU0 the check for a pending
         timer returns true, CPU1 might still schedule the timer callback,
         thereby deleting the subscription. Thus when CPU0 is scheduled,
         it deletes an invalid subscription.
      2. We export tipc_subscrp_report_overlap(), which accesses the
         subscription pointer multiple times. Meanwhile the subscription
         timer can expire thereby freeing the subscription and we might
         continue to access the subscription pointer leading to memory
         violations.
      
      In this commit, we introduce subscription refcount to avoid deleting
      an invalid subscription.
      Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d094c4d5
    • tipc: fix nametbl_lock soft lockup at node/link events · 93f955aa
      Parthasarathy Bhuvaragan authored
      We trigger a soft lockup as we grab nametbl_lock twice if the node
      has a pending node up/down or link up/down event while:
      - we process an incoming named message in tipc_named_rcv() and
        perform a tipc_update_nametbl().
      - we have pending backlog items in the name distributor queue
        during a nametable update using tipc_nametbl_publish() or
        tipc_nametbl_withdraw().
      
      The associated call chains are:
      tipc_named_rcv() Grabs nametbl_lock
         tipc_update_nametbl() (publish/withdraw)
           tipc_node_subscribe()/unsubscribe()
             tipc_node_write_unlock()
                << lockup occurs if an outstanding node/link event
                   exists, as we grab nametbl_lock again >>
      
      tipc_nametbl_withdraw() Grab nametbl_lock
        tipc_named_process_backlog()
          tipc_update_nametbl()
            << rest as above >>
      
      The function tipc_node_write_unlock(), in addition to releasing the
      lock processes the outstanding node/link up/down events. To do this,
      we need to grab the nametbl_lock again leading to the lockup.
      
      In this commit we fix the soft lockup by introducing a fast variant of
      node_unlock(), where we just release the lock. We adapt the
      node_subscribe()/node_unsubscribe() to use the fast variants.
      Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      93f955aa
    • netfilter: nf_tables: bump set->ndeact on set flush · b2c11e4b
      Pablo Neira Ayuso authored
      Add missing set->ndeact update on each deactivated element from the set
      flush path. Otherwise, sets with fixed size break after flush since
      accounting breaks.
      
       # nft add set x y { type ipv4_addr\; size 2\; }
       # nft add element x y { 1.1.1.1 }
       # nft add element x y { 1.1.1.2 }
       # nft flush set x y
       # nft add element x y { 1.1.1.1 }
       <cmdline>:1:1-28: Error: Could not process rule: Too many open files in system
      
      Fixes: 8411b644 ("netfilter: nf_tables: support for set flushing")
      Reported-by: Elise Lennion <elise.lennion@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      b2c11e4b
    • netfilter: nf_tables: deconstify walk callback function · de70185d
      Pablo Neira Ayuso authored
      The flush operation needs to modify set and element objects, so let's
      deconstify this.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      de70185d
    • netfilter: nf_tables: fix set->nelems counting with no NLM_F_EXCL · 35d0ac90
      Pablo Neira Ayuso authored
      If the element exists and no NLM_F_EXCL is specified, do not bump
      set->nelems, otherwise we leak one set element slot. This problem
      amplifies if the set is full since the abort path always decrements the
      counter for the -ENFILE case too, giving one spare extra slot.
      
      Fix this by moving set->nelems update to nft_add_set_elem() after
      successful element insertion. Moreover, remove the element if the set is
      full so there is no need to rely on the abort path to undo things
      anymore.
      
      Fixes: c016c7e4 ("netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      35d0ac90
    • netfilter: nft_log: restrict the log prefix length to 127 · 5ce6b04c
      Liping Zhang authored
      First, log prefix will be truncated to NF_LOG_PREFIXLEN-1, i.e. 127,
      at nf_log_packet(), so the extra part is useless.
      
      Second, after adding a log rule with a very long prefix, we will
      fail to dump the nft rules after this _special_ one, but actually,
      they do exist. For example:
        # name_65000=$(printf "%0.sQ" {1..65000})
        # nft add rule filter output log prefix "$name_65000"
        # nft add rule filter output counter
        # nft add rule filter output counter
        # nft list chain filter output
        table ip filter {
            chain output {
                type filter hook output priority 0; policy accept;
            }
        }
      
      So now, restrict the log prefix length to NF_LOG_PREFIXLEN-1.
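
As a userspace illustration of the length rule, the check below accepts a prefix only if it is at most 127 characters long. The constant name FAKE_NF_LOG_PREFIXLEN and the helper log_prefix_ok are invented here; the value 128 is inferred from the statement above that NF_LOG_PREFIXLEN-1 is 127, and the kernel enforces the limit through its own attribute validation rather than a function like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Inferred from the text above: NF_LOG_PREFIXLEN - 1 is 127. */
#define FAKE_NF_LOG_PREFIXLEN 128

/* Return true if prefix has at most FAKE_NF_LOG_PREFIXLEN - 1 characters. */
static bool log_prefix_ok(const char *prefix)
{
    assert(prefix != NULL);
    /*
     * Look for the terminator within the first 128 bytes only, so an
     * oversized input is rejected without scanning all of it.
     */
    return memchr(prefix, '\0', FAKE_NF_LOG_PREFIXLEN) != NULL;
}
```

Rejecting the oversized prefix up front avoids storing data that would be truncated at logging time anyway.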
      
      Fixes: 96518518 ("netfilter: add nftables")
      Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      5ce6b04c
    • Merge branch 'alx-mq-fixes' · 294628c1
      David S. Miller authored
      Tobias Regnery says:
      
      ====================
      alx: fix fallout from multi queue conversion
      
      Here are 3 fixes for the multi queue conversion in v4.10.
      
      The first patch fixes a wrong condition in an if statement.
      
      Patches 2 and 3 fix regressions in the corner case when requesting msi-x
      interrupts fails and we fall back to msi or legacy interrupts.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      294628c1
    • alx: work around hardware bug in interrupt fallback path · 185aceef
      Tobias Regnery authored
      If requesting msi-x interrupts fails in alx_request_irq we fall back to
      a single tx queue and msi or legacy interrupts.
      
      Currently the adapter stops working in this case and we get tx watchdog
      timeouts. For reasons unknown the adapter gets confused when we load the
      dma addresses to the chip in alx_init_ring_ptrs twice: the first time with
      multiple queues and the second time in the fallback case with a single
      queue.
      
      To fix this, move the call to alx_reinit_rings (which calls
      alx_init_ring_ptrs) after alx_request_irq. At this time it is clear how
      many tx queues we have and which dma addresses we use.
      
      Fixes: d768319c ("alx: enable multiple tx queues")
      Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      185aceef