Commit 9a95d9c6 authored by David S. Miller

Merge branch 'Remove-rtnl-lock-dependency-from-all-action-implementations'

Vlad Buslov says:

====================
Remove rtnl lock dependency from all action implementations

Currently, all netlink protocol handlers for updating rules, actions
and qdiscs are protected by a single global rtnl lock, which removes
any possibility of parallelism. This patch set is the second step in
removing the rtnl lock dependency from the TC rules update path.

Recently, a new rtnl registration flag, RTNL_FLAG_DOIT_UNLOCKED, was
added. Handlers registered with this flag are called without the rtnl
lock taken. The end goal is to have the rule update handlers
(RTM_NEWTFILTER, RTM_DELTFILTER, etc.) registered with the UNLOCKED
flag to allow parallel execution. However, there is no intention to
completely remove or split the rtnl lock itself. This patch set
addresses specific problems in the implementation of tc actions that
prevent their control path from being executed concurrently.
Additional changes are required to refactor the classifiers API and
individual classifiers for parallel execution. This patch set lays
the groundwork for eventually registering the rule update handlers as
rtnl-unlocked.
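
For reference, a minimal sketch of what such a registration would
look like once the conversion is complete (not part of this series;
tc_new_tfilter is the existing RTM_NEWTFILTER doit handler):

    /* Sketch only: handlers registered with this flag run without
     * rtnl, so everything they call must be safe against
     * concurrent control-path updates.
     */
    rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL,
                  RTNL_FLAG_DOIT_UNLOCKED);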

The action API was already prepared for parallel execution by the
previous patch set, which means that action ops that use the action
API for their implementation (delete, search, etc.) do not require
additional modifications. The action API implements concurrency-safe
reference counting and guarantees that cleanup/delete is called only
once, after the last reference to the action has been released.
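
In simplified form the scheme looks like the sketch below (the real
code lives in act_api.c and also synchronizes with the action idr;
the helper name here is made up for illustration):

    static void tcf_action_put_sketch(struct tc_action *p)
    {
            /* Only the caller that drops the last reference sees
             * true here, so cleanup runs exactly once even with
             * concurrent delete/put.
             */
            if (refcount_dec_and_test(&p->tcfa_refcnt))
                    tcf_action_cleanup(p);
    }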

The goal of this change is to update the APIs of specific actions
that access action private state directly, so that they do not depend
on external locking. The general approach is to re-use the existing
tcf_lock spinlock (already used by some action implementations to
synchronize their control path with the data path) to protect action
private state from concurrent modification. If an action has an
rcu-protected pointer, the tcf spinlock is used to protect its update
code, instead of relying on the rtnl lock.
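
The resulting idiom, as used in the csum, tunnel_key and vlan patches
below: the new parameter struct is swapped in under tcf_lock, and the
old copy is freed after an rcu grace period:

    spin_lock(&p->tcf_lock);
    p->tcf_action = parm->action;
    params_new->update_flags = parm->update_flags;
    /* Publish new params; params_new now holds the old pointer. */
    rcu_swap_protected(p->params, params_new,
                       lockdep_is_held(&p->tcf_lock));
    spin_unlock(&p->tcf_lock);
    if (params_new)
            kfree_rcu(params_new, rcu);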

Some actions need to know whether the rtnl mutex is held in order to
release it. For example, the ife action can load additional kernel
modules (meta ops) and must make sure that no locks are held during
the module load. In such cases an 'rtnl_held' argument is used to
conditionally release the rtnl mutex.
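
In the ife patch below this takes the following shape: any locks the
caller may hold are dropped around the module request and re-taken
afterwards:

    if (exists)
            spin_unlock_bh(&ife->tcf_lock);
    if (rtnl_held)
            rtnl_unlock();
    request_module("ife-meta-%s", ife_meta_id2name(metaid));
    if (rtnl_held)
            rtnl_lock();
    if (exists)
            spin_lock_bh(&ife->tcf_lock);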

Changes from V1 to V2:
- Patch 12:
  - new patch
- Patch 14:
  - refactor gen_new_estimator() to reuse stats_lock when re-assigning
    rate estimator statistics pointer
- Remove mirred and tunnel_key helper function changes. (to be
  submitted as a standalone patch)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
parents 2b14e1ea e329bc42
@@ -101,6 +101,7 @@ struct tc_action_ops {
void (*stats_update)(struct tc_action *, u64, u32, u64);
size_t (*get_fill_size)(const struct tc_action *act);
struct net_device *(*get_dev)(const struct tc_action *a);
void (*put_dev)(struct net_device *dev);
int (*delete)(struct net *net, u32 index);
};
......
@@ -59,13 +59,13 @@ int gnet_stats_finish_copy(struct gnet_dump *d);
int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
struct gnet_stats_basic_cpu __percpu *cpu_bstats,
struct net_rate_estimator __rcu **rate_est,
spinlock_t *stats_lock,
spinlock_t *lock,
seqcount_t *running, struct nlattr *opt);
void gen_kill_estimator(struct net_rate_estimator __rcu **ptr);
int gen_replace_estimator(struct gnet_stats_basic_packed *bstats,
struct gnet_stats_basic_cpu __percpu *cpu_bstats,
struct net_rate_estimator __rcu **ptr,
spinlock_t *stats_lock,
spinlock_t *lock,
seqcount_t *running, struct nlattr *opt);
bool gen_estimator_active(struct net_rate_estimator __rcu **ptr);
bool gen_estimator_read(struct net_rate_estimator __rcu **ptr,
......
@@ -112,7 +112,7 @@ static void est_timer(struct timer_list *t)
* @bstats: basic statistics
* @cpu_bstats: bstats per cpu
* @rate_est: rate estimator statistics
* @stats_lock: statistics lock
* @lock: lock for statistics and control path
* @running: qdisc running seqcount
* @opt: rate estimator configuration TLV
*
@@ -128,7 +128,7 @@ static void est_timer(struct timer_list *t)
int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
struct gnet_stats_basic_cpu __percpu *cpu_bstats,
struct net_rate_estimator __rcu **rate_est,
spinlock_t *stats_lock,
spinlock_t *lock,
seqcount_t *running,
struct nlattr *opt)
{
@@ -154,19 +154,22 @@ int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
seqcount_init(&est->seq);
intvl_log = parm->interval + 2;
est->bstats = bstats;
est->stats_lock = stats_lock;
est->stats_lock = lock;
est->running = running;
est->ewma_log = parm->ewma_log;
est->intvl_log = intvl_log;
est->cpu_bstats = cpu_bstats;
if (stats_lock)
if (lock)
local_bh_disable();
est_fetch_counters(est, &b);
if (stats_lock)
if (lock)
local_bh_enable();
est->last_bytes = b.bytes;
est->last_packets = b.packets;
if (lock)
spin_lock_bh(lock);
old = rcu_dereference_protected(*rate_est, 1);
if (old) {
del_timer_sync(&old->timer);
@@ -179,6 +182,8 @@ int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
mod_timer(&est->timer, est->next_jiffies);
rcu_assign_pointer(*rate_est, est);
if (lock)
spin_unlock_bh(lock);
if (old)
kfree_rcu(old, rcu);
return 0;
@@ -209,7 +214,7 @@ EXPORT_SYMBOL(gen_kill_estimator);
* @bstats: basic statistics
* @cpu_bstats: bstats per cpu
* @rate_est: rate estimator statistics
* @stats_lock: statistics lock
* @lock: lock for statistics and control path
* @running: qdisc running seqcount (might be NULL)
* @opt: rate estimator configuration TLV
*
@@ -221,11 +226,11 @@ EXPORT_SYMBOL(gen_kill_estimator);
int gen_replace_estimator(struct gnet_stats_basic_packed *bstats,
struct gnet_stats_basic_cpu __percpu *cpu_bstats,
struct net_rate_estimator __rcu **rate_est,
spinlock_t *stats_lock,
spinlock_t *lock,
seqcount_t *running, struct nlattr *opt)
{
return gen_new_estimator(bstats, cpu_bstats, rate_est,
stats_lock, running, opt);
lock, running, opt);
}
EXPORT_SYMBOL(gen_replace_estimator);
......
@@ -143,11 +143,12 @@ static int tcf_bpf_dump(struct sk_buff *skb, struct tc_action *act,
.index = prog->tcf_index,
.refcnt = refcount_read(&prog->tcf_refcnt) - ref,
.bindcnt = atomic_read(&prog->tcf_bindcnt) - bind,
.action = prog->tcf_action,
};
struct tcf_t tm;
int ret;
spin_lock(&prog->tcf_lock);
opt.action = prog->tcf_action;
if (nla_put(skb, TCA_ACT_BPF_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
@@ -163,9 +164,11 @@ static int tcf_bpf_dump(struct sk_buff *skb, struct tc_action *act,
TCA_ACT_BPF_PAD))
goto nla_put_failure;
spin_unlock(&prog->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock(&prog->tcf_lock);
nlmsg_trim(skb, tp);
return -1;
}
@@ -264,7 +267,7 @@ static void tcf_bpf_prog_fill_cfg(const struct tcf_bpf *prog,
{
cfg->is_ebpf = tcf_bpf_is_ebpf(prog);
/* updates to prog->filter are prevented, since it's called either
* with rtnl lock or during final cleanup in rcu callback
* with tcf lock or during final cleanup in rcu callback
*/
cfg->filter = rcu_dereference_protected(prog->filter, 1);
@@ -336,8 +339,8 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla,
goto out;
prog = to_bpf(*act);
ASSERT_RTNL();
spin_lock(&prog->tcf_lock);
if (res != ACT_P_CREATED)
tcf_bpf_prog_fill_cfg(prog, &old);
@@ -349,6 +352,7 @@ static int tcf_bpf_init(struct net *net, struct nlattr *nla,
prog->tcf_action = parm->action;
rcu_assign_pointer(prog->filter, cfg.filter);
spin_unlock(&prog->tcf_lock);
if (res == ACT_P_CREATED) {
tcf_idr_insert(tn, *act);
......
@@ -50,7 +50,7 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
struct netlink_ext_ack *extack)
{
struct tc_action_net *tn = net_generic(net, csum_net_id);
struct tcf_csum_params *params_old, *params_new;
struct tcf_csum_params *params_new;
struct nlattr *tb[TCA_CSUM_MAX + 1];
struct tc_csum *parm;
struct tcf_csum *p;
@@ -88,20 +88,22 @@ static int tcf_csum_init(struct net *net, struct nlattr *nla,
}
p = to_tcf_csum(*a);
ASSERT_RTNL();
params_new = kzalloc(sizeof(*params_new), GFP_KERNEL);
if (unlikely(!params_new)) {
tcf_idr_release(*a, bind);
return -ENOMEM;
}
params_old = rtnl_dereference(p->params);
params_new->update_flags = parm->update_flags;
spin_lock(&p->tcf_lock);
p->tcf_action = parm->action;
params_new->update_flags = parm->update_flags;
rcu_assign_pointer(p->params, params_new);
if (params_old)
kfree_rcu(params_old, rcu);
rcu_swap_protected(p->params, params_new,
lockdep_is_held(&p->tcf_lock));
spin_unlock(&p->tcf_lock);
if (params_new)
kfree_rcu(params_new, rcu);
if (ret == ACT_P_CREATED)
tcf_idr_insert(tn, *a);
@@ -599,11 +601,13 @@ static int tcf_csum_dump(struct sk_buff *skb, struct tc_action *a, int bind,
.index = p->tcf_index,
.refcnt = refcount_read(&p->tcf_refcnt) - ref,
.bindcnt = atomic_read(&p->tcf_bindcnt) - bind,
.action = p->tcf_action,
};
struct tcf_t t;
params = rtnl_dereference(p->params);
spin_lock(&p->tcf_lock);
params = rcu_dereference_protected(p->params,
lockdep_is_held(&p->tcf_lock));
opt.action = p->tcf_action;
opt.update_flags = params->update_flags;
if (nla_put(skb, TCA_CSUM_PARMS, sizeof(opt), &opt))
@@ -612,10 +616,12 @@ static int tcf_csum_dump(struct sk_buff *skb, struct tc_action *a, int bind,
tcf_tm_dump(&t, &p->tcf_tm);
if (nla_put_64bit(skb, TCA_CSUM_TM, sizeof(t), &t, TCA_CSUM_PAD))
goto nla_put_failure;
spin_unlock(&p->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock(&p->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
......
@@ -113,7 +113,7 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
gact = to_gact(*a);
ASSERT_RTNL();
spin_lock(&gact->tcf_lock);
gact->tcf_action = parm->action;
#ifdef CONFIG_GACT_PROB
if (p_parm) {
@@ -126,6 +126,8 @@ static int tcf_gact_init(struct net *net, struct nlattr *nla,
gact->tcfg_ptype = p_parm->ptype;
}
#endif
spin_unlock(&gact->tcf_lock);
if (ret == ACT_P_CREATED)
tcf_idr_insert(tn, *a);
return ret;
@@ -178,10 +180,11 @@ static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a,
.index = gact->tcf_index,
.refcnt = refcount_read(&gact->tcf_refcnt) - ref,
.bindcnt = atomic_read(&gact->tcf_bindcnt) - bind,
.action = gact->tcf_action,
};
struct tcf_t t;
spin_lock(&gact->tcf_lock);
opt.action = gact->tcf_action;
if (nla_put(skb, TCA_GACT_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
#ifdef CONFIG_GACT_PROB
@@ -199,9 +202,12 @@ static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a,
tcf_tm_dump(&t, &gact->tcf_tm);
if (nla_put_64bit(skb, TCA_GACT_TM, sizeof(t), &t, TCA_GACT_PAD))
goto nla_put_failure;
spin_unlock(&gact->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock(&gact->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
......
@@ -268,7 +268,8 @@ static const char *ife_meta_id2name(u32 metaid)
* under ife->tcf_lock for existing action
*/
static int load_metaops_and_vet(struct tcf_ife_info *ife, u32 metaid,
void *val, int len, bool exists)
void *val, int len, bool exists,
bool rtnl_held)
{
struct tcf_meta_ops *ops = find_ife_oplist(metaid);
int ret = 0;
@@ -278,8 +279,10 @@ static int load_metaops_and_vet(struct tcf_ife_info *ife, u32 metaid,
#ifdef CONFIG_MODULES
if (exists)
spin_unlock_bh(&ife->tcf_lock);
if (rtnl_held)
rtnl_unlock();
request_module("ife-meta-%s", ife_meta_id2name(metaid));
if (rtnl_held)
rtnl_lock();
if (exists)
spin_lock_bh(&ife->tcf_lock);
@@ -421,7 +424,7 @@ static void tcf_ife_cleanup(struct tc_action *a)
/* under ife->tcf_lock for existing action */
static int populate_metalist(struct tcf_ife_info *ife, struct nlattr **tb,
bool exists)
bool exists, bool rtnl_held)
{
int len = 0;
int rc = 0;
@@ -433,7 +436,8 @@ static int populate_metalist(struct tcf_ife_info *ife, struct nlattr **tb,
val = nla_data(tb[i]);
len = nla_len(tb[i]);
rc = load_metaops_and_vet(ife, i, val, len, exists);
rc = load_metaops_and_vet(ife, i, val, len, exists,
rtnl_held);
if (rc != 0)
return rc;
@@ -454,7 +458,7 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
struct tc_action_net *tn = net_generic(net, ife_net_id);
struct nlattr *tb[TCA_IFE_MAX + 1];
struct nlattr *tb2[IFE_META_MAX + 1];
struct tcf_ife_params *p, *p_old;
struct tcf_ife_params *p;
struct tcf_ife_info *ife;
u16 ife_type = ETH_P_IFE;
struct tc_ife *parm;
@@ -558,7 +562,7 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
return err;
}
err = populate_metalist(ife, tb2, exists);
err = populate_metalist(ife, tb2, exists, rtnl_held);
if (err)
goto metadata_parse_err;
@@ -581,13 +585,13 @@ static int tcf_ife_init(struct net *net, struct nlattr *nla,
}
ife->tcf_action = parm->action;
/* protected by tcf_lock when modifying existing action */
rcu_swap_protected(ife->params, p, 1);
if (exists)
spin_unlock_bh(&ife->tcf_lock);
p_old = rtnl_dereference(ife->params);
rcu_assign_pointer(ife->params, p);
if (p_old)
kfree_rcu(p_old, rcu);
if (p)
kfree_rcu(p, rcu);
if (ret == ACT_P_CREATED)
tcf_idr_insert(tn, *a);
@@ -600,16 +604,20 @@ static int tcf_ife_dump(struct sk_buff *skb, struct tc_action *a, int bind,
{
unsigned char *b = skb_tail_pointer(skb);
struct tcf_ife_info *ife = to_ife(a);
struct tcf_ife_params *p = rtnl_dereference(ife->params);
struct tcf_ife_params *p;
struct tc_ife opt = {
.index = ife->tcf_index,
.refcnt = refcount_read(&ife->tcf_refcnt) - ref,
.bindcnt = atomic_read(&ife->tcf_bindcnt) - bind,
.action = ife->tcf_action,
.flags = p->flags,
};
struct tcf_t t;
spin_lock_bh(&ife->tcf_lock);
opt.action = ife->tcf_action;
p = rcu_dereference_protected(ife->params,
lockdep_is_held(&ife->tcf_lock));
opt.flags = p->flags;
if (nla_put(skb, TCA_IFE_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
@@ -635,9 +643,11 @@ static int tcf_ife_dump(struct sk_buff *skb, struct tc_action *a, int bind,
pr_info("Failed to dump metalist\n");
}
spin_unlock_bh(&ife->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock_bh(&ife->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
......
@@ -288,6 +288,7 @@ static int tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind,
* for foolproof you need to not assume this
*/
spin_lock_bh(&ipt->tcf_lock);
t = kmemdup(ipt->tcfi_t, ipt->tcfi_t->u.user.target_size, GFP_ATOMIC);
if (unlikely(!t))
goto nla_put_failure;
@@ -307,10 +308,12 @@ static int tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind,
if (nla_put_64bit(skb, TCA_IPT_TM, sizeof(tm), &tm, TCA_IPT_PAD))
goto nla_put_failure;
spin_unlock_bh(&ipt->tcf_lock);
kfree(t);
return skb->len;
nla_put_failure:
spin_unlock_bh(&ipt->tcf_lock);
nlmsg_trim(skb, b);
kfree(t);
return -1;
......
@@ -30,6 +30,7 @@
#include <net/tc_act/tc_mirred.h>
static LIST_HEAD(mirred_list);
static DEFINE_SPINLOCK(mirred_list_lock);
static bool tcf_mirred_is_act_redirect(int action)
{
@@ -62,13 +63,23 @@ static bool tcf_mirred_can_reinsert(int action)
return false;
}
static struct net_device *tcf_mirred_dev_dereference(struct tcf_mirred *m)
{
return rcu_dereference_protected(m->tcfm_dev,
lockdep_is_held(&m->tcf_lock));
}
static void tcf_mirred_release(struct tc_action *a)
{
struct tcf_mirred *m = to_mirred(a);
struct net_device *dev;
spin_lock(&mirred_list_lock);
list_del(&m->tcfm_list);
dev = rtnl_dereference(m->tcfm_dev);
spin_unlock(&mirred_list_lock);
/* last reference to action, no need to lock */
dev = rcu_dereference_protected(m->tcfm_dev, 1);
if (dev)
dev_put(dev);
}
@@ -128,22 +139,9 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
NL_SET_ERR_MSG_MOD(extack, "Unknown mirred option");
return -EINVAL;
}
if (parm->ifindex) {
dev = __dev_get_by_index(net, parm->ifindex);
if (dev == NULL) {
if (exists)
tcf_idr_release(*a, bind);
else
tcf_idr_cleanup(tn, parm->index);
return -ENODEV;
}
mac_header_xmit = dev_is_mac_header_xmit(dev);
} else {
dev = NULL;
}
if (!exists) {
if (!dev) {
if (!parm->ifindex) {
tcf_idr_cleanup(tn, parm->index);
NL_SET_ERR_MSG_MOD(extack, "Specified device does not exist");
return -EINVAL;
@@ -161,19 +159,31 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla,
}
m = to_mirred(*a);
ASSERT_RTNL();
spin_lock(&m->tcf_lock);
m->tcf_action = parm->action;
m->tcfm_eaction = parm->eaction;
if (dev != NULL) {
if (ret != ACT_P_CREATED)
dev_put(rcu_dereference_protected(m->tcfm_dev, 1));
dev_hold(dev);
rcu_assign_pointer(m->tcfm_dev, dev);
if (parm->ifindex) {
dev = dev_get_by_index(net, parm->ifindex);
if (!dev) {
spin_unlock(&m->tcf_lock);
tcf_idr_release(*a, bind);
return -ENODEV;
}
mac_header_xmit = dev_is_mac_header_xmit(dev);
rcu_swap_protected(m->tcfm_dev, dev,
lockdep_is_held(&m->tcf_lock));
if (dev)
dev_put(dev);
m->tcfm_mac_header_xmit = mac_header_xmit;
}
spin_unlock(&m->tcf_lock);
if (ret == ACT_P_CREATED) {
spin_lock(&mirred_list_lock);
list_add(&m->tcfm_list, &mirred_list);
spin_unlock(&mirred_list_lock);
tcf_idr_insert(tn, *a);
}
@@ -287,26 +297,33 @@ static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind,
{
unsigned char *b = skb_tail_pointer(skb);
struct tcf_mirred *m = to_mirred(a);
struct net_device *dev = rtnl_dereference(m->tcfm_dev);
struct tc_mirred opt = {
.index = m->tcf_index,
.action = m->tcf_action,
.refcnt = refcount_read(&m->tcf_refcnt) - ref,
.bindcnt = atomic_read(&m->tcf_bindcnt) - bind,
.eaction = m->tcfm_eaction,
.ifindex = dev ? dev->ifindex : 0,
};
struct net_device *dev;
struct tcf_t t;
spin_lock(&m->tcf_lock);
opt.action = m->tcf_action;
opt.eaction = m->tcfm_eaction;
dev = tcf_mirred_dev_dereference(m);
if (dev)
opt.ifindex = dev->ifindex;
if (nla_put(skb, TCA_MIRRED_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
tcf_tm_dump(&t, &m->tcf_tm);
if (nla_put_64bit(skb, TCA_MIRRED_TM, sizeof(t), &t, TCA_MIRRED_PAD))
goto nla_put_failure;
spin_unlock(&m->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock(&m->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
@@ -337,15 +354,19 @@ static int mirred_device_event(struct notifier_block *unused,
ASSERT_RTNL();
if (event == NETDEV_UNREGISTER) {
spin_lock(&mirred_list_lock);
list_for_each_entry(m, &mirred_list, tcfm_list) {
if (rcu_access_pointer(m->tcfm_dev) == dev) {
spin_lock(&m->tcf_lock);
if (tcf_mirred_dev_dereference(m) == dev) {
dev_put(dev);
/* Note : no rcu grace period necessary, as
* net_device are already rcu protected.
*/
RCU_INIT_POINTER(m->tcfm_dev, NULL);
}
spin_unlock(&m->tcf_lock);
}
spin_unlock(&mirred_list_lock);
}
return NOTIFY_DONE;
@@ -358,8 +379,20 @@ static struct notifier_block mirred_device_notifier = {
static struct net_device *tcf_mirred_get_dev(const struct tc_action *a)
{
struct tcf_mirred *m = to_mirred(a);
struct net_device *dev;
rcu_read_lock();
dev = rcu_dereference(m->tcfm_dev);
if (dev)
dev_hold(dev);
rcu_read_unlock();
return rtnl_dereference(m->tcfm_dev);
return dev;
}
static void tcf_mirred_put_dev(struct net_device *dev)
{
dev_put(dev);
}
static int tcf_mirred_delete(struct net *net, u32 index)
@@ -382,6 +415,7 @@ static struct tc_action_ops act_mirred_ops = {
.lookup = tcf_mirred_search,
.size = sizeof(struct tcf_mirred),
.get_dev = tcf_mirred_get_dev,
.put_dev = tcf_mirred_put_dev,
.delete = tcf_mirred_delete,
};
......
@@ -187,44 +187,38 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla,
tcf_idr_cleanup(tn, parm->index);
goto out_free;
}
p = to_pedit(*a);
keys = kmalloc(ksize, GFP_KERNEL);
if (!keys) {
tcf_idr_release(*a, bind);
ret = -ENOMEM;
goto out_free;
}
ret = ACT_P_CREATED;
} else if (err > 0) {
if (bind)
goto out_free;
if (!ovr) {
tcf_idr_release(*a, bind);
ret = -EEXIST;
goto out_free;
}
p = to_pedit(*a);
if (p->tcfp_nkeys && p->tcfp_nkeys != parm->nkeys) {
keys = kmalloc(ksize, GFP_KERNEL);
if (!keys) {
ret = -ENOMEM;
goto out_free;
}
goto out_release;
}
} else {
return err;
}
p = to_pedit(*a);
spin_lock_bh(&p->tcf_lock);
p->tcfp_flags = parm->flags;
p->tcf_action = parm->action;
if (keys) {
if (ret == ACT_P_CREATED ||
(p->tcfp_nkeys && p->tcfp_nkeys != parm->nkeys)) {
keys = kmalloc(ksize, GFP_ATOMIC);
if (!keys) {
spin_unlock_bh(&p->tcf_lock);
ret = -ENOMEM;
goto out_release;
}
kfree(p->tcfp_keys);
p->tcfp_keys = keys;
p->tcfp_nkeys = parm->nkeys;
}
memcpy(p->tcfp_keys, parm->keys, ksize);
p->tcfp_flags = parm->flags;
p->tcf_action = parm->action;
kfree(p->tcfp_keys_ex);
p->tcfp_keys_ex = keys_ex;
@@ -232,6 +226,9 @@ static int tcf_pedit_init(struct net *net, struct nlattr *nla,
if (ret == ACT_P_CREATED)
tcf_idr_insert(tn, *a);
return ret;
out_release:
tcf_idr_release(*a, bind);
out_free:
kfree(keys_ex);
return ret;
@@ -410,6 +407,7 @@ static int tcf_pedit_dump(struct sk_buff *skb, struct tc_action *a,
if (unlikely(!opt))
return -ENOBUFS;
spin_lock_bh(&p->tcf_lock);
memcpy(opt->keys, p->tcfp_keys,
p->tcfp_nkeys * sizeof(struct tc_pedit_key));
opt->index = p->tcf_index;
@@ -432,11 +430,13 @@ static int tcf_pedit_dump(struct sk_buff *skb, struct tc_action *a,
tcf_tm_dump(&t, &p->tcf_tm);
if (nla_put_64bit(skb, TCA_PEDIT_TM, sizeof(t), &t, TCA_PEDIT_PAD))
goto nla_put_failure;
spin_unlock_bh(&p->tcf_lock);
kfree(opt);
return skb->len;
nla_put_failure:
spin_unlock_bh(&p->tcf_lock);
nlmsg_trim(skb, b);
kfree(opt);
return -1;
......
@@ -274,14 +274,15 @@ static int tcf_act_police_dump(struct sk_buff *skb, struct tc_action *a,
struct tcf_police *police = to_police(a);
struct tc_police opt = {
.index = police->tcf_index,
.action = police->tcf_action,
.mtu = police->tcfp_mtu,
.burst = PSCHED_NS2TICKS(police->tcfp_burst),
.refcnt = refcount_read(&police->tcf_refcnt) - ref,
.bindcnt = atomic_read(&police->tcf_bindcnt) - bind,
};
struct tcf_t t;
spin_lock_bh(&police->tcf_lock);
opt.action = police->tcf_action;
opt.mtu = police->tcfp_mtu;
opt.burst = PSCHED_NS2TICKS(police->tcfp_burst);
if (police->rate_present)
psched_ratecfg_getrate(&opt.rate, &police->rate);
if (police->peak_present)
@@ -301,10 +302,12 @@ static int tcf_act_police_dump(struct sk_buff *skb, struct tc_action *a,
t.expires = jiffies_to_clock_t(police->tcf_tm.expires);
if (nla_put_64bit(skb, TCA_POLICE_TM, sizeof(t), &t, TCA_POLICE_PAD))
goto nla_put_failure;
spin_unlock_bh(&police->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock_bh(&police->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
......
@@ -80,11 +80,13 @@ static int tcf_sample_init(struct net *net, struct nlattr *nla,
}
s = to_sample(*a);
spin_lock(&s->tcf_lock);
s->tcf_action = parm->action;
s->rate = nla_get_u32(tb[TCA_SAMPLE_RATE]);
s->psample_group_num = nla_get_u32(tb[TCA_SAMPLE_PSAMPLE_GROUP]);
psample_group = psample_group_get(net, s->psample_group_num);
if (!psample_group) {
spin_unlock(&s->tcf_lock);
tcf_idr_release(*a, bind);
return -ENOMEM;
}
@@ -94,6 +96,7 @@ static int tcf_sample_init(struct net *net, struct nlattr *nla,
s->truncate = true;
s->trunc_size = nla_get_u32(tb[TCA_SAMPLE_TRUNC_SIZE]);
}
spin_unlock(&s->tcf_lock);
if (ret == ACT_P_CREATED)
tcf_idr_insert(tn, *a);
@@ -105,7 +108,8 @@ static void tcf_sample_cleanup(struct tc_action *a)
struct tcf_sample *s = to_sample(a);
struct psample_group *psample_group;
psample_group = rtnl_dereference(s->psample_group);
/* last reference to action, no need to lock */
psample_group = rcu_dereference_protected(s->psample_group, 1);
RCU_INIT_POINTER(s->psample_group, NULL);
if (psample_group)
psample_group_put(psample_group);
@@ -174,12 +178,13 @@ static int tcf_sample_dump(struct sk_buff *skb, struct tc_action *a,
struct tcf_sample *s = to_sample(a);
struct tc_sample opt = {
.index = s->tcf_index,
.action = s->tcf_action,
.refcnt = refcount_read(&s->tcf_refcnt) - ref,
.bindcnt = atomic_read(&s->tcf_bindcnt) - bind,
};
struct tcf_t t;
spin_lock(&s->tcf_lock);
opt.action = s->tcf_action;
if (nla_put(skb, TCA_SAMPLE_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
@@ -196,9 +201,12 @@ static int tcf_sample_dump(struct sk_buff *skb, struct tc_action *a,
if (nla_put_u32(skb, TCA_SAMPLE_PSAMPLE_GROUP, s->psample_group_num))
goto nla_put_failure;
spin_unlock(&s->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock(&s->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
......
@@ -156,10 +156,11 @@ static int tcf_simp_dump(struct sk_buff *skb, struct tc_action *a,
.index = d->tcf_index,
.refcnt = refcount_read(&d->tcf_refcnt) - ref,
.bindcnt = atomic_read(&d->tcf_bindcnt) - bind,
.action = d->tcf_action,
};
struct tcf_t t;
spin_lock_bh(&d->tcf_lock);
opt.action = d->tcf_action;
if (nla_put(skb, TCA_DEF_PARMS, sizeof(opt), &opt) ||
nla_put_string(skb, TCA_DEF_DATA, d->tcfd_defdata))
goto nla_put_failure;
@@ -167,9 +168,12 @@ static int tcf_simp_dump(struct sk_buff *skb, struct tc_action *a,
tcf_tm_dump(&t, &d->tcf_tm);
if (nla_put_64bit(skb, TCA_DEF_TM, sizeof(t), &t, TCA_DEF_PAD))
goto nla_put_failure;
spin_unlock_bh(&d->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock_bh(&d->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
......
@@ -156,7 +156,6 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
d = to_skbmod(*a);
ASSERT_RTNL();
p = kzalloc(sizeof(struct tcf_skbmod_params), GFP_KERNEL);
if (unlikely(!p)) {
tcf_idr_release(*a, bind);
@@ -166,10 +165,10 @@ static int tcf_skbmod_init(struct net *net, struct nlattr *nla,
p->flags = lflags;
d->tcf_action = parm->action;
p_old = rtnl_dereference(d->skbmod_p);
if (ovr)
spin_lock_bh(&d->tcf_lock);
/* Protected by tcf_lock if overwriting existing action. */
p_old = rcu_dereference_protected(d->skbmod_p, 1);
if (lflags & SKBMOD_F_DMAC)
ether_addr_copy(p->eth_dst, daddr);
@@ -205,15 +204,18 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
{
struct tcf_skbmod *d = to_skbmod(a);
unsigned char *b = skb_tail_pointer(skb);
struct tcf_skbmod_params *p = rtnl_dereference(d->skbmod_p);
struct tcf_skbmod_params *p;
struct tc_skbmod opt = {
.index = d->tcf_index,
.refcnt = refcount_read(&d->tcf_refcnt) - ref,
.bindcnt = atomic_read(&d->tcf_bindcnt) - bind,
.action = d->tcf_action,
};
struct tcf_t t;
spin_lock_bh(&d->tcf_lock);
opt.action = d->tcf_action;
p = rcu_dereference_protected(d->skbmod_p,
lockdep_is_held(&d->tcf_lock));
opt.flags = p->flags;
if (nla_put(skb, TCA_SKBMOD_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
@@ -231,8 +233,10 @@ static int tcf_skbmod_dump(struct sk_buff *skb, struct tc_action *a,
if (nla_put_64bit(skb, TCA_SKBMOD_TM, sizeof(t), &t, TCA_SKBMOD_PAD))
goto nla_put_failure;
spin_unlock_bh(&d->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock_bh(&d->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
......
@@ -204,7 +204,6 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
{
struct tc_action_net *tn = net_generic(net, tunnel_key_net_id);
struct nlattr *tb[TCA_TUNNEL_KEY_MAX + 1];
struct tcf_tunnel_key_params *params_old;
struct tcf_tunnel_key_params *params_new;
struct metadata_dst *metadata = NULL;
struct tc_tunnel_key *parm;
@@ -346,24 +345,22 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
t = to_tunnel_key(*a);
ASSERT_RTNL();
params_new = kzalloc(sizeof(*params_new), GFP_KERNEL);
if (unlikely(!params_new)) {
tcf_idr_release(*a, bind);
NL_SET_ERR_MSG(extack, "Cannot allocate tunnel key parameters");
return -ENOMEM;
}
params_old = rtnl_dereference(t->params);
t->tcf_action = parm->action;
params_new->tcft_action = parm->t_action;
params_new->tcft_enc_metadata = metadata;
rcu_assign_pointer(t->params, params_new);
if (params_old)
kfree_rcu(params_old, rcu);
spin_lock(&t->tcf_lock);
t->tcf_action = parm->action;
rcu_swap_protected(t->params, params_new,
lockdep_is_held(&t->tcf_lock));
spin_unlock(&t->tcf_lock);
if (params_new)
kfree_rcu(params_new, rcu);
if (ret == ACT_P_CREATED)
tcf_idr_insert(tn, *a);
@@ -485,12 +482,13 @@ static int tunnel_key_dump(struct sk_buff *skb, struct tc_action *a,
.index = t->tcf_index,
.refcnt = refcount_read(&t->tcf_refcnt) - ref,
.bindcnt = atomic_read(&t->tcf_bindcnt) - bind,
.action = t->tcf_action,
};
struct tcf_t tm;
params = rtnl_dereference(t->params);
spin_lock(&t->tcf_lock);
params = rcu_dereference_protected(t->params,
lockdep_is_held(&t->tcf_lock));
opt.action = t->tcf_action;
opt.t_action = params->tcft_action;
if (nla_put(skb, TCA_TUNNEL_KEY_PARMS, sizeof(opt), &opt))
@@ -522,10 +520,12 @@ static int tunnel_key_dump(struct sk_buff *skb, struct tc_action *a,
if (nla_put_64bit(skb, TCA_TUNNEL_KEY_TM, sizeof(tm),
&tm, TCA_TUNNEL_KEY_PAD))
goto nla_put_failure;
spin_unlock(&t->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock(&t->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
......
@@ -109,7 +109,7 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
{
struct tc_action_net *tn = net_generic(net, vlan_net_id);
struct nlattr *tb[TCA_VLAN_MAX + 1];
struct tcf_vlan_params *p, *p_old;
struct tcf_vlan_params *p;
struct tc_vlan *parm;
struct tcf_vlan *v;
int action;
@@ -202,26 +202,24 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
v = to_vlan(*a);
ASSERT_RTNL();
p = kzalloc(sizeof(*p), GFP_KERNEL);
if (!p) {
tcf_idr_release(*a, bind);
return -ENOMEM;
}
v->tcf_action = parm->action;
p_old = rtnl_dereference(v->vlan_p);
p->tcfv_action = action;
p->tcfv_push_vid = push_vid;
p->tcfv_push_prio = push_prio;
p->tcfv_push_proto = push_proto;
rcu_assign_pointer(v->vlan_p, p);
spin_lock(&v->tcf_lock);
v->tcf_action = parm->action;
rcu_swap_protected(v->vlan_p, p, lockdep_is_held(&v->tcf_lock));
spin_unlock(&v->tcf_lock);
if (p_old)
kfree_rcu(p_old, rcu);
if (p)
kfree_rcu(p, rcu);
if (ret == ACT_P_CREATED)
tcf_idr_insert(tn, *a);
@@ -243,16 +241,18 @@ static int tcf_vlan_dump(struct sk_buff *skb, struct tc_action *a,
{
unsigned char *b = skb_tail_pointer(skb);
struct tcf_vlan *v = to_vlan(a);
struct tcf_vlan_params *p = rtnl_dereference(v->vlan_p);
struct tcf_vlan_params *p;
struct tc_vlan opt = {
.index = v->tcf_index,
.refcnt = refcount_read(&v->tcf_refcnt) - ref,
.bindcnt = atomic_read(&v->tcf_bindcnt) - bind,
.action = v->tcf_action,
.v_action = p->tcfv_action,
};
struct tcf_t t;
spin_lock(&v->tcf_lock);
opt.action = v->tcf_action;
p = rcu_dereference_protected(v->vlan_p, lockdep_is_held(&v->tcf_lock));
opt.v_action = p->tcfv_action;
if (nla_put(skb, TCA_VLAN_PARMS, sizeof(opt), &opt))
goto nla_put_failure;
@@ -268,9 +268,12 @@ static int tcf_vlan_dump(struct sk_buff *skb, struct tc_action *a,
tcf_tm_dump(&t, &v->tcf_tm);
if (nla_put_64bit(skb, TCA_VLAN_TM, sizeof(t), &t, TCA_VLAN_PAD))
goto nla_put_failure;
spin_unlock(&v->tcf_lock);
return skb->len;
nla_put_failure:
spin_unlock(&v->tcf_lock);
nlmsg_trim(skb, b);
return -1;
}
......
@@ -2176,6 +2176,7 @@ static int tc_exts_setup_cb_egdev_call(struct tcf_exts *exts,
if (!dev)
continue;
ret = tc_setup_cb_egdev_call(dev, type, type_data, err_stop);
a->ops->put_dev(dev);
if (ret < 0)
return ret;
ok_count += ret;
......