- 09 Mar, 2020 24 commits
-
-
Eric Dumazet authored
Convert zones_lock spinlock to zones_mutex mutex, and struct (tcf_ct_flow_table)->ref to a refcount, so that control path can use regular GFP_KERNEL allocations from standard process context. This is more robust in case of memory pressure. The refcount is needed because tcf_ct_flow_table_put() can be called from RCU callback, thus in BH context. The issue was spotted by syzbot, as rhashtable_init() was called with a spinlock held, which is bad since GFP_KERNEL allocations can sleep. Note to developers : Please make sure your patches are tested with CONFIG_DEBUG_ATOMIC_SLEEP=y BUG: sleeping function called from invalid context at mm/slab.h:565 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9582, name: syz-executor610 2 locks held by syz-executor610/9582: #0: ffffffff8a34eb80 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:72 [inline] #0: ffffffff8a34eb80 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x3f9/0xad0 net/core/rtnetlink.c:5437 #1: ffffffff8a3961b8 (zones_lock){+...}, at: spin_lock_bh include/linux/spinlock.h:343 [inline] #1: ffffffff8a3961b8 (zones_lock){+...}, at: tcf_ct_flow_table_get+0xa3/0x1700 net/sched/act_ct.c:67 Preemption disabled at: [<0000000000000000>] 0x0 CPU: 0 PID: 9582 Comm: syz-executor610 Not tainted 5.6.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 ___might_sleep.cold+0x1f4/0x23d kernel/sched/core.c:6798 slab_pre_alloc_hook mm/slab.h:565 [inline] slab_alloc_node mm/slab.c:3227 [inline] kmem_cache_alloc_node_trace+0x272/0x790 mm/slab.c:3593 __do_kmalloc_node mm/slab.c:3615 [inline] __kmalloc_node+0x38/0x60 mm/slab.c:3623 kmalloc_node include/linux/slab.h:578 [inline] kvmalloc_node+0x61/0xf0 mm/util.c:574 kvmalloc include/linux/mm.h:645 [inline] kvzalloc include/linux/mm.h:653 [inline] bucket_table_alloc+0x8b/0x480 lib/rhashtable.c:175 rhashtable_init+0x3d2/0x750 lib/rhashtable.c:1054 nf_flow_table_init+0x16d/0x310 net/netfilter/nf_flow_table_core.c:498 tcf_ct_flow_table_get+0xe33/0x1700 net/sched/act_ct.c:82 tcf_ct_init+0xba4/0x18a6 net/sched/act_ct.c:1050 tcf_action_init_1+0x697/0xa20 net/sched/act_api.c:945 tcf_action_init+0x1e9/0x2f0 net/sched/act_api.c:1001 tcf_action_add+0xdb/0x370 net/sched/act_api.c:1411 tc_ctl_action+0x366/0x456 net/sched/act_api.c:1466 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5440 netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2478 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6b9/0x7d0 net/socket.c:2343 ___sys_sendmsg+0x100/0x170 net/socket.c:2397 __sys_sendmsg+0xec/0x1b0 net/socket.c:2430 do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4403d9 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffd719af218 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004403d9 RDX: 0000000000000000 RSI: 0000000020000300 RDI: 0000000000000003 RBP: 00000000006ca018 R08: 0000000000000005 R09: 00000000004002c8 R10: 0000000000000008 R11: 00000000000 Fixes: c34b961a ("net/sched: act_ct: Create nf flow table per zone") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paul Blakey <paulb@mellanox.com> Cc: Jiri Pirko <jiri@mellanox.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
The rmnet_vnd_setup(), which is the callback of ->ndo_start_xmit() is allowed to call concurrently because it uses RCU protected data. So, it doesn't need tx lock. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Taehee Yoo says: ==================== bareudp: several code cleanup for bareudp module This patchset is to cleanup bareudp module code. 1. The first patch is to add module alias In the current bareudp code, there is no module alias. So, RTNL couldn't load bareudp module automatically. 2. The second patch is to add extack message. The extack error message is useful for noticing specific errors when command is failed. 3. The third patch is to remove unnecessary udp_encap_enable(). In the bareudp_socket_create(), udp_encap_enable() is called. But, the it's already called in the setup_udp_tunnel_sock(). So, it could be removed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
In the current code, udp_encap_enable() is called in bareudp_socket_create(). But, setup_udp_tunnel_sock() internally calls udp_encap_enable(). So, udp_encap_enable() is unnecessary. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
When bareudp netlink command fails, it doesn't print any error message. So, users couldn't know the exact reason. In order to tell the exact reason to the user, the extack error message is used in this patch. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
In the current bareudp code, there is no module alias. So, RTNL couldn't load bareudp module automatically. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Rohit Maheshwari says: ==================== cxgb4/chcr: ktls tx ofld support on T6 adapter This series of patches add support for kernel tls offload in Tx direction, over Chelsio T6 NICs. SKBs marked as decrypted will be treated as tls plain text packets and then offloaded to encrypt using network device (chelsio T6 adapter). This series is broken down as follows: Patch 1 defines a new macro and registers tls_dev_add and tls_dev_del callbacks. When tls_dev_add gets called we send a connection request to our hardware and to make HW understand about tls offload. Its a partial connection setup and only ipv4 part is done. Patch 2 handles the HW response of the connection request and then we request to update TCB and handle it's HW response as well. Also we save crypto key locally. Only supporting TLS_CIPHER_AES_GCM_128_KEY_SIZE. Patch 3 handles tls marked skbs (decrypted bit set) and sends it to ULD for crypto handling. This code has a minimal portion of tx handler, to handle only one complete record per skb. Patch 4 hanldes partial end part of records. Also added logic to handle multiple records in one single skb. It also adds support to send out tcp option(/s) if exists in skb. If a record is partial but has end part of a record, we'll fetch complete record and then only send it to HW to generate HASH on complete record. Patch 5 handles partial first or middle part of record, it uses AES_CTR to encrypt the partial record. If we are trying to send middle record, it's start should be 16 byte aligned, so we'll fetch few earlier bytes from the record and then send it to HW for encryption. Patch 6 enables ipv6 support and also includes ktls startistics. v1->v2: - mark tcb state to close in tls_dev_del. - u_ctx is now picked from adapter structure. - clear atid in case of failure. - corrected ULP_CRYPTO_KTLS_INLINE value. - optimized tcb update using control queue. - state machine handling when earlier states received. - chcr_write_cpl_set_tcb_ulp function is shifted to patch3. - un-necessary updating left variable. v2->v3: - add empty line after variable declaration. - local variable declaration in reverse christmas tree ordering. v3->v4: - replaced kfree_skb with dev_kfree_skb_any. - corrected error message reported by kbuild test robot <lkp@intel.com> - mss calculation logic. - correct place for Alloc skb check. - Replaced atomic_t with atomic64_t - added few more statistics counters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rohit Maheshwari authored
Adding ipv6 support and ktls related statistics. v1->v2: - added blank lines at 2 places. v3->v4: - Replaced atomic_t with atomic64_t - added few necessary stat counters. Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rohit Maheshwari authored
This patch contains handling of first part or middle part of the record. When we get a middle record, we will fetch few already sent bytes to make packet start 16 byte aligned. And if the packet has only the header part, we don't need to send it for packet encryption, send that packet as a plaintext. v1->v2: - un-necessary updating left variable. v3->v4: - replaced kfree_skb with dev_kfree_skb_any. Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rohit Maheshwari authored
TCP segment can chop a record in any order. Record can either be complete or it can be partial (first part which contains header, middle part which doesn't have header or TAG, and the end part which contains TAG. This patch handles partial end part of a tx record. In case of partial end part's, driver will send complete record to HW, so that HW will calculate GHASH (TAG) of complete packet. Also added support to handle multiple records in a segment. v1->v2: - miner change in calling chcr_write_cpl_set_tcb_ulp. - no need of checking return value of chcr_ktls_write_tcp_options. v3->v4: - replaced kfree_skb with dev_kfree_skb_any. Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rohit Maheshwari authored
Added tx handling in this patch. This includes handling of segments contain single complete record. v1->v2: - chcr_write_cpl_set_tcb_ulp is added in this patch. v3->v4: - mss calculation logic. - replaced kfree_skb with dev_kfree_skb_any. - corrected error message reported by kbuild test robot <lkp@intel.com> Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rohit Maheshwari authored
As part of this patch generated and saved crypto keys, handled HW response of act_open_req and set_tcb_req. Defined connection state update. v1->v2: - optimized tcb update using control queue. - state machine handling when earlier states received. v2->v3: - Added one empty line after function declaration. Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rohit Maheshwari authored
A new macro is defined to enable ktls tx offload support on Chelsio T6 adapter. And if this macro is enabled, cxgb4 will send mailbox to enable or disable ktls settings on HW. In chcr, enabled tx offload flag in netdev and registered tls_dev_add and tls_dev_del. v1->v2: - mark tcb state to close in tls_dev_del. - u_ctx is now picked from adapter structure. - clear atid in case of failure. - corrected ULP_CRYPTO_KTLS_INLINE value. v2->v3: - add empty line after variable declaration. - local variable declaration in reverse christmas tree ordering. Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jiri Pirko says: ==================== net: allow user specify TC action HW stats type Currently, when user adds a TC action and the action gets offloaded, the user expects the HW stats to be counted and included in stats dump. However, since drivers may implement different types of counting, there is no way to specify which one the user is interested in. For example for mlx5, only delayed counters are available as the driver periodically polls for updated stats. In case of mlxsw, the counters are queried on dump time. However, the HW resources for this type of counters is quite limited (couple of thousands). This limits the amount of supported offloaded filters significantly. Without counter assigned, the HW is capable to carry millions of those. On top of that, mlxsw HW is able to support delayed counters as well in greater numbers. That is going to be added in a follow-up patch. This patchset allows user to specify one of the following types of HW stats for added action: immediate - queried during dump time delayed - polled from HW periodically or sent by HW in async manner disabled - no stats needed Note that if "hw_stats" option is not passed, user does not care about the type, just expects any type of stats. Examples: $ tc filter add dev enp0s16np28 ingress proto ip handle 1 pref 1 flower skip_sw dst_ip 192.168.1.1 action drop hw_stats disabled $ tc -s filter show dev enp0s16np28 ingress filter protocol ip pref 1 flower chain 0 filter protocol ip pref 1 flower chain 0 handle 0x1 eth_type ipv4 dst_ip 192.168.1.1 skip_sw in_hw in_hw_count 2 action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 installed 7 sec used 2 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 hw_stats disabled $ tc filter add dev enp0s16np28 ingress proto ip handle 1 pref 1 flower skip_sw dst_ip 192.168.1.1 action drop hw_stats immediate $ tc -s filter show dev enp0s16np28 ingress filter protocol ip pref 1 flower chain 0 filter protocol ip pref 1 flower chain 0 handle 0x1 eth_type ipv4 dst_ip 192.168.1.1 skip_sw in_hw in_hw_count 2 action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 installed 11 sec used 4 sec Action statistics: Sent 102 bytes 1 pkt (dropped 1, overlimits 0 requeues 0) Sent software 0 bytes 0 pkt Sent hardware 102 bytes 1 pkt backlog 0b 0p requeues 0 hw_stats immediate ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Currently, user who is adding an action expects HW to report stats, however it does not have exact expectations about the stats types. That is aligned with TCA_ACT_HW_STATS_TYPE_ANY. Allow user to specify the type of HW stats for an action and require it. Pass the information down to flow_offload layer. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Introduce new type for disabled HW stats and allow the value in mlxsw offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Set a flag in case rule counter was created. Only query the device for stats of a rule, which has the valid counter assigned. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Introduce new type for delayed HW stats and allow the value in mlx5 offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Introduce new type for immediate HW stats and allow the value in mlxsw offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Currently don't allow actions with any other type to be inserted. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
As there is one set of counters for the whole action chain, forbid to mix the HW stats types. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Introduce flow_action_basic_hw_stats_types_check() helper and use it in drivers. That sanitizes the drivers which do not have support for action HW stats types. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Instead of directly checking number of action entries, use flow_offload_has_one_action() helper. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Initially, pass "ANY" (struct is zeroed) to the drivers as that is the current implicit value coming down to flow_offload. Add a bool indicating that entries have mixed HW stats type. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Mar, 2020 10 commits
-
-
David S. Miller authored
Jakub Kicinski says: ==================== ethtool: consolidate irq coalescing - other drivers Convert more drivers following the groundwork laid in a recent patch set [1]. The aim of the effort is to consolidate irq coalescing parameter validation in the core. This set converts all the drivers outside of drivers/net/ethernet. Only vmxnet3 them was checking unsupported parameters. The aim is to merge this via the net-next tree so we can convert all drivers and make the checking mandatory. [1] https://lore.kernel.org/netdev/20200305051542.991898-1-kuba@kernel.org/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver correctly rejects all unsupported parameters. As a side effect of these changes the error code for unsupported params changes from EINVAL to EOPNOTSUPP. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Set ethtool_ops->supported_coalesce_params to let the core reject unsupported coalescing parameters. This driver did not previously reject unsupported parameters. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ansuel Smith authored
Add documentations for ipq806x mdio driver. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ansuel Smith authored
Currently ipq806x soc use generic bitbang driver to comunicate with the gmac ethernet interface. Add a dedicated driver created by chunkeey to fix this. Co-developed-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 Mar, 2020 6 commits
-
-
David S. Miller authored
Michal Kubecek says: ==================== tun: debug messages cleanup While testing ethtool output for "strange" devices, I noticed confusing and obviously incorrect message level information for a tun device and sent a quick fix. The result of the upstream discussion was that tun driver would rather deserve a more complex cleanup of the way it handles debug messages. The main problem is that all debugging statements and setting of message level are controlled by TUN_DEBUG macro which is only defined if one edits the source and rebuilds the module, otherwise all DBG1() and tun_debug() statements do nothing. This series drops the TUN_DEBUG switch and replaces custom tun_debug() macro with standard netif_info() so that message level (mask) set and displayed using ethtool works as expected. Some debugging messages are dropped as they only notify about entering a function which can be done easily using ftrace or kprobe. Patch 1 is a trivial fix for compilation warning with W=1. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Kubecek authored
TUN_DEBUG and tun_debug() are no longer used anywhere, drop them. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Kubecek authored
The tun driver uses custom macro tun_debug() which is only available if TUN_DEBUG is set. Replace it by standard netif_ifinfo(). For that purpose, rename tun_struct::debug to msg_enable and make it u32 and always present. Finally, make tun_get_msglevel(), tun_set_msglevel() and TUNSETDEBUG ioctl independent of TUN_DEBUG. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Kubecek authored
Some of the tun_debug() statements only inform us about entering a function which can be easily achieved with ftrace or kprobe. As tun_debug() is no-op unless TUN_DEBUG is set which requires editing the source and recompiling, setting up ftrace or kprobe is easier. Drop these debug statements. Also drop the tun_debug() statement informing about SIOCSIFHWADDR ioctl. We can monitor these through rtnetlink and it makes little sense to log address changes through ioctl but not changes through rtnetlink. Moreover, this tun_debug() is called even if the actual address change fails which makes it even less useful. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Kubecek authored
This macro is no-op unless TUN_DEBUG is defined (which requires editing and recompiling the source) and only does something if variable debug is 2 but that variable is zero initialized and never set to anything else. Moreover, the only use of the macro informs about entering function tun_chr_open() which can be easily achieved using ftrace or kprobe. Drop DBG1() macro, its only use and global variable debug. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Kubecek authored
The comment above tun_flow_save_rps_rxhash() starts with "/**" which makes it look like kerneldoc comment and results in warnings when building with W=1. Fix the format to make it look like a normal comment. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-