Commit e68409db authored by Jamal Hadi Salim's avatar Jamal Hadi Salim Committed by Jakub Kicinski

net: sched: cls_u32: Fix match key mis-addressing

A match entry is uniquely identified with an "address" or "path" in the
form of: hashtable ID(12b):bucketid(8b):nodeid(12b).

When creating table match entries all of hash table id, bucket id and
node (match entry id) are needed to be either specified by the user or
reasonable in-kernel defaults are used. The in-kernel default for a table id is
0x800(omnipresent root table); for bucketid it is 0x0. Prior to this fix there
was none for a nodeid i.e. the code assumed that the user passed the correct
nodeid and if the user passes a nodeid of 0 (as Mingi Cho did) then that is what
was used. But nodeid of 0 is reserved for identifying the table. This is not
a problem until we dump. The dump code notices that the nodeid is zero and
assumes it is referencing a table and therefore references table struct
tc_u_hnode instead of what was created i.e match entry struct tc_u_knode.

Ming does an equivalent of:
tc filter add dev dummy0 parent 10: prio 1 handle 0x1000 \
protocol ip u32 match ip src 10.0.0.1/32 classid 10:1 action ok

Essentially specifying a table id 0, bucketid 1 and nodeid of zero
Tableid 0 is remapped to the default of 0x800.
Bucketid 1 is ignored and defaults to 0x00.
Nodeid was assumed to be what Ming passed - 0x000

dumping before fix shows:
~$ tc filter ls dev dummy0 parent 10:
filter protocol ip pref 1 u32 chain 0
filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1
filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor -30591

Note that the last line reports a table instead of a match entry
(you can tell this because it says "ht divisor...").
As a result of reporting the wrong data type (misinterpretting of struct
tc_u_knode as being struct tc_u_hnode) the divisor is reported with value
of -30591. Ming identified this as part of the heap address
(physmap_base is 0xffff8880 (-30591 - 1)).

The fix is to ensure that when table entry matches are added and no
nodeid is specified (i.e nodeid == 0) then we get the next available
nodeid from the table's pool.

After the fix, this is what the dump shows:
$ tc filter ls dev dummy0 parent 10:
filter protocol ip pref 1 u32 chain 0
filter protocol ip pref 1 u32 chain 0 fh 800: ht divisor 1
filter protocol ip pref 1 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 10:1 not_in_hw
  match 0a000001/ffffffff at 12
	action order 1: gact action pass
	 random type none pass val 0
	 index 1 ref 1 bind 1
Reported-by: default avatarMingi Cho <mgcho.minic@gmail.com>
Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20230726135151.416917-1-jhs@mojatatu.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
parent 5416d792
...@@ -1024,18 +1024,62 @@ static int u32_change(struct net *net, struct sk_buff *in_skb, ...@@ -1024,18 +1024,62 @@ static int u32_change(struct net *net, struct sk_buff *in_skb,
return -EINVAL; return -EINVAL;
} }
/* At this point, we need to derive the new handle that will be used to
* uniquely map the identity of this table match entry. The
* identity of the entry that we need to construct is 32 bits made of:
* htid(12b):bucketid(8b):node/entryid(12b)
*
* At this point _we have the table(ht)_ in which we will insert this
* entry. We carry the table's id in variable "htid".
* Note that earlier code picked the ht selection either by a) the user
* providing the htid specified via TCA_U32_HASH attribute or b) when
* no such attribute is passed then the root ht, is default to at ID
* 0x[800][00][000]. Rule: the root table has a single bucket with ID 0.
* If OTOH the user passed us the htid, they may also pass a bucketid of
* choice. 0 is fine. For example a user htid is 0x[600][01][000] it is
* indicating hash bucketid of 1. Rule: the entry/node ID _cannot_ be
* passed via the htid, so even if it was non-zero it will be ignored.
*
* We may also have a handle, if the user passed one. The handle also
* carries the same addressing of htid(12b):bucketid(8b):node/entryid(12b).
* Rule: the bucketid on the handle is ignored even if one was passed;
* rather the value on "htid" is always assumed to be the bucketid.
*/
if (handle) { if (handle) {
/* Rule: The htid from handle and tableid from htid must match */
if (TC_U32_HTID(handle) && TC_U32_HTID(handle ^ htid)) { if (TC_U32_HTID(handle) && TC_U32_HTID(handle ^ htid)) {
NL_SET_ERR_MSG_MOD(extack, "Handle specified hash table address mismatch"); NL_SET_ERR_MSG_MOD(extack, "Handle specified hash table address mismatch");
return -EINVAL; return -EINVAL;
} }
handle = htid | TC_U32_NODE(handle); /* Ok, so far we have a valid htid(12b):bucketid(8b) but we
err = idr_alloc_u32(&ht->handle_idr, NULL, &handle, handle, * need to finalize the table entry identification with the last
GFP_KERNEL); * part - the node/entryid(12b)). Rule: Nodeid _cannot be 0_ for
if (err) * entries. Rule: nodeid of 0 is reserved only for tables(see
return err; * earlier code which processes TC_U32_DIVISOR attribute).
} else * Rule: The nodeid can only be derived from the handle (and not
* htid).
* Rule: if the handle specified zero for the node id example
* 0x60000000, then pick a new nodeid from the pool of IDs
* this hash table has been allocating from.
* If OTOH it is specified (i.e for example the user passed a
* handle such as 0x60000123), then we use it generate our final
* handle which is used to uniquely identify the match entry.
*/
if (!TC_U32_NODE(handle)) {
handle = gen_new_kid(ht, htid);
} else {
handle = htid | TC_U32_NODE(handle);
err = idr_alloc_u32(&ht->handle_idr, NULL, &handle,
handle, GFP_KERNEL);
if (err)
return err;
}
} else {
/* The user did not give us a handle; lets just generate one
* from the table's pool of nodeids.
*/
handle = gen_new_kid(ht, htid); handle = gen_new_kid(ht, htid);
}
if (tb[TCA_U32_SEL] == NULL) { if (tb[TCA_U32_SEL] == NULL) {
NL_SET_ERR_MSG_MOD(extack, "Selector not specified"); NL_SET_ERR_MSG_MOD(extack, "Selector not specified");
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment