- 30 Mar, 2020 4 commits
-
-
Pablo Neira Ayuso authored
The bitmap set does not support for expressions, skip it from the estimation step. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
If the global set expression definition mismatches the dynset expression, then bail out. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
Otherwise, nft_lookup might dereference an uninitialized pointer to the element extension. Fixes: 665153ff ("netfilter: nf_tables: add bitmap set type") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Romain Bellan authored
When CONFIG_NF_CONNTRACK_MARK is not set, any CTA_MARK or CTA_MARK_MASK in netlink message are not supported. We should return an error when one of them is set, not both Fixes: 9306425b ("netfilter: ctnetlink: must check mark attributes vs NULL") Signed-off-by:
Romain Bellan <romain.bellan@wifirst.fr> Signed-off-by:
Florent Fourcot <florent.fourcot@wifirst.fr> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
- 29 Mar, 2020 4 commits
-
-
Florian Westphal authored
Instead of dropping refs+kfree, use the helper added in previous patch. Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
nf_queue is problematic when another NF_QUEUE invocation happens from nf_reinject(). 1. nf_queue is invoked, increments state->sk refcount. 2. skb is queued, waiting for verdict. 3. sk is closed/released. 3. verdict comes back, nf_reinject is called. 4. nf_reinject drops the reference -- refcount can now drop to 0 Instead of get_ref/release_ref pattern, we need to nest the get_ref calls: get_ref get_ref release_ref release_ref So that when we invoke the next processing stage (another netfilter or the okfn()), we hold at least one reference count on the devices/socket. After previous patch, it is now safe to put the entry even after okfn() has potentially free'd the skb. Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
The refcount is done via entry->skb, which does work fine. Major problem: When putting the refcount of the bridge ports, we must always put the references while the skb is still around. However, we will need to put the references after okfn() to avoid a possible 1 -> 0 -> 1 refcount transition, so we cannot use the skb pointer anymore. Place the physports in the queue entry structure instead to allow for refcounting changes in the next patch. Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
This is a preparation patch, no logical changes. Move free_entry into core and rename it to something more sensible. Will ease followup patches which will complicate the refcount handling. Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
- 27 Mar, 2020 10 commits
-
-
Paul Blakey authored
To allow offload commands to execute in parallel, create workqueue for flow table offload, and use a work entry per offload command. Signed-off-by:
Paul Blakey <paulb@mellanox.com> Reviewed-by:
Oz Shlomo <ozsh@mellanox.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Paul Blakey authored
Currently flow offload threads are synchronized by the flow block mutex. Use rw lock instead to increase flow insertion (read) concurrency. Signed-off-by:
Paul Blakey <paulb@mellanox.com> Reviewed-by:
Oz Shlomo <ozsh@mellanox.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Qian Cai authored
It is safe to traverse &net->nft.tables with &net->nft.commit_mutex held using list_for_each_entry_rcu(). Silence the PROVE_RCU_LIST false positive, WARNING: suspicious RCU usage net/netfilter/nf_tables_api.c:523 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by iptables/1384: #0: ffffffff9745c4a8 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x25/0x60 [nf_tables] Call Trace: dump_stack+0xa1/0xea lockdep_rcu_suspicious+0x103/0x10d nft_table_lookup.part.0+0x116/0x120 [nf_tables] nf_tables_newtable+0x12c/0x7d0 [nf_tables] nfnetlink_rcv_batch+0x559/0x1190 [nfnetlink] nfnetlink_rcv+0x1da/0x210 [nfnetlink] netlink_unicast+0x306/0x460 netlink_sendmsg+0x44b/0x770 ____sys_sendmsg+0x46b/0x4a0 ___sys_sendmsg+0x138/0x1a0 __sys_sendmsg+0xb6/0x130 __x64_sys_sendmsg+0x48/0x50 do_syscall_64+0x69/0xf4 entry_SYSCALL_64_after_hwframe+0x49/0xb3 Signed-off-by:
Qian Cai <cai@lca.pw> Acked-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
wenxu authored
The indirect block setup should use TC_SETUP_FT as the type instead of TC_SETUP_BLOCK. Adjust existing users of the indirect flow block infrastructure. Fixes: b5140a36 ("netfilter: flowtable: add indr block setup support") Signed-off-by:
wenxu <wenxu@ucloud.cn> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
Add a new flag to turn on flowtable counters which are stored in the conntrack entry. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
Expose the NFT_FLOWTABLE_HW_OFFLOAD flag through uapi. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
This function allows you to update the conntrack counters. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Haishuang Yan authored
After strip GRE/UDP tunnel header for icmp errors, it's better to show "GRE/UDP" instead of "IPIP" in debug message. Signed-off-by:
Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by:
Julian Anastasov <ja@ssi.bg> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Jules Irenge authored
netfilter: conntrack: Add missing annotations for nf_conntrack_all_lock() and nf_conntrack_all_unlock() Sparse reports warnings at nf_conntrack_all_lock() and nf_conntrack_all_unlock() warning: context imbalance in nf_conntrack_all_lock() - wrong count at exit warning: context imbalance in nf_conntrack_all_unlock() - unexpected unlock Add the missing __acquires(&nf_conntrack_locks_all_lock) Add missing __releases(&nf_conntrack_locks_all_lock) Signed-off-by:
Jules Irenge <jbi.octave@gmail.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Jules Irenge authored
Sparse reports a warning at ctnetlink_parse_nat_setup() warning: context imbalance in ctnetlink_parse_nat_setup() - unexpected unlock The root cause is the missing annotation at ctnetlink_parse_nat_setup() Add the missing __must_hold(RCU) annotation Signed-off-by:
Jules Irenge <jbi.octave@gmail.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
- 19 Mar, 2020 8 commits
-
-
wenxu authored
The tc ct action does not cache the route in the flowtable entry. Fixes: 88bf6e41 ("netfilter: flowtable: add tunnel encap/decap action offload support") Fixes: cfab6dbd ("netfilter: flowtable: add tunnel match offload support") Signed-off-by:
wenxu <wenxu@ucloud.cn> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
This patch adds nft_set_elem_expr_destroy() to destroy stateful expressions in set elements. This patch also updates the commit path to call this function to invoke expr->ops->destroy_clone when required. This is implicitly fixing up a module reference counter leak and a memory leak in expressions that allocated internal state, e.g. nft_counter. Fixes: 40944452 ("netfilter: nf_tables: add elements with stateful expressions") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
After copying the expression to the set element extension, release the expression and reset the pointer to avoid a double-free from the error path. Fixes: 40944452 ("netfilter: nf_tables: add elements with stateful expressions") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
This patch allows users to specify the stateful expression for the elements in this set via NFTA_SET_EXPR. This new feature allows you to turn on counters for all of the elements in this set. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
The patch that adds support for stateful expressions in set definitions require this. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Pablo Neira Ayuso authored
Move the nft_expr_clone() helper function to the core. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== mlx5-updates-2020-03-17 1) Compiler warnings and cleanup for the connection tracking series 2) Bug fixes for the connection tracking series 3) Fix devlink port register sequence 4) Last five patches in the series, By Eli cohen Add the support for forwarding traffic between two eswitch uplink representors (Hairpin for eswitch), using mlx5 termination tables to change the direction of a packet in hw from RX to TX pipeline. ==================== Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
At least some integrated PHY's in RTL8168/RTL8125 chip versions support downshift, and the actual link speed can be read from a vendor-specific register. Info about this register was provided by Realtek. More details about downshift configuration (e.g. number of attempts) aren't available, therefore the downshift tunable is not implemented. Signed-off-by:
Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 18 Mar, 2020 14 commits
-
-
Petr Machata authored
In the commit referenced below, hw_stats_type of an entry is set for every entry that corresponds to a pedit action. However, the assignment is only done after the entry pointer is bumped, and therefore could overwrite memory outside of the entries array. The reason for this positioning may have been that the current entry's hw_stats_type is already set above, before the action-type dispatch. However, if there are no more actions, the assignment is wrong. And if there are, the next round of the for_each_action loop will make the assignment before the action-type dispatch anyway. Therefore fix this issue by simply reordering the two lines. Fixes: 74522e7b ("net: sched: set the hw_stats_type in pedit loop") Signed-off-by:
Petr Machata <petrm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ido Schimmel says: ==================== mlxsw: spectrum_cnt: Expose counter resources Jiri says: Capacity and utilization of existing flow and RIF counters are currently unavailable to be seen by the user. Use the existing devlink resources API to expose the information: $ sudo devlink resource show pci/0000:00:10.0 -v pci/0000:00:10.0: name kvd resource_path /kvd size 524288 unit entry dpipe_tables none name span_agents resource_path /span_agents size 8 occ 0 unit entry dpipe_tables none name counters resource_path /counters size 79872 occ 44 unit entry dpipe_tables none resources: name flow resource_path /counters/flow size 61440 occ 4 unit entry dpipe_tables none name rif resource_path /counters/rif size 18432 occ 40 unit entry dpipe_tables none ==================== Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Add tests for mlxsw hw_stats types. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Implement occupancy counting for counters and expose over devlink resource API. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Put all init operations related to subpools into mlxsw_sp_counter_sub_pools_init(). Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Move the validation of subpools configuration, to avoid possible over commitment to resource registration. Add WARN_ON to indicate bug in the code. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Implement devlink resources support for counter pools. Move the subpool sizes calculations into the new resources register function. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Add new field to subpool struct that would indicate which resource id should be used to query the entry size for the subpool from the device. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Currently, the global static array of subpools is used. Make it per-instance as multiple instances of the mlxsw driver can have different values. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
With the change that made the code to query counter bank size from device instead of using hard-coded value, the number of available counters changed for Spectrum-2. Adjust the limit in the selftests. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
The bank size is different between Spectrum versions. Also it is a resource that can be queried. So instead of hard coding the value in code, query it from the firmware. Signed-off-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Rahul Lakkireddy authored
Chelsio NICs have 3 filter regions, in following order of priority: 1. High Priority (HPFILTER) region (Highest Priority). 2. HASH region. 3. Normal FILTER region (Lowest Priority). Currently, there's a 1-to-1 mapping between the prio value passed by TC and the filter region index. However, it's possible to have multiple TC rules with the same prio value. In this case, if a region is exhausted, no attempt is made to try inserting the rule in the next available region. So, rework and remove the 1-to-1 mapping. Instead, dynamically select the region to insert the filter rule, as long as the new rule's prio value doesn't conflict with existing rules across all the 3 regions. Signed-off-by:
Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
This reverts the following commits: 8537f786 ("netfilter: Introduce egress hook") 5418d388 ("netfilter: Generalize ingress hook") b030f194 ("netfilter: Rename ingress hook include file") >From the discussion in [0], the author's main motivation to add a hook in fast path is for an out of tree kernel module, which is a red flag to begin with. Other mentioned potential use cases like NAT{64,46} is on future extensions w/o concrete code in the tree yet. Revert as suggested [1] given the weak justification to add more hooks to critical fast-path. [0] https://lore.kernel.org/netdev/cover.1583927267.git.lukas@wunner.de/ [1] https://lore.kernel.org/netdev/20200318.011152.72770718915606186.davem@davemloft.net/Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Cc: David Miller <davem@davemloft.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Alexei Starovoitov <ast@kernel.org> Nacked-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Julian Wiedmann says: ==================== s390/qeth: updates 2020-03-18 please apply the following patch series for qeth to netdev's net-next tree. This consists of three parts: 1) support for __GFP_MEMALLOC, 2) several ethtool enhancements (.set_channels, SW Timestamping), 3) the usual cleanups. ==================== Signed-off-by:
David S. Miller <davem@davemloft.net>
-