- 10 Jun, 2017 3 commits
-
-
David S. Miller authored
Yuval Mintz says: ==================== bnx2x: Fix malicious VFs indication It was discovered that for a VF there's a simple [yet uncommon] scenario which would cause device firmware to declare that VF as malicious - Add a vlan interface on top of a VF and disable txvlan offloading for that VF [causing VF to transmit packets where vlan is on payload]. Patch #1 corrects driver transmission to prevent this issue. Patch #2 is a by-product correcting PF behavior once a VF is declared malicious. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mintz, Yuval authored
Once firmware indicates that a given VF is malicious and until that VF passes an FLR all bets are off - PF can't know anything is happening to the VF [since VF can't communicate anything to its PF]. But PF is currently still periodically asking device to collect statistics for the VF which might in turn fill logs by IOMMU blocking memory access done by the VF's PCI function [in the case VF has unmapped its buffers]. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mintz, Yuval authored
VF clients are configured as enforced, meaning firmware is validating the correctness of their ethertype/vid during transmission. Once txvlan is disabled, VF would start getting SKBs for transmission here vlan is on the payload - but it'll pass the packet's ethertype instead of the vid, leading to firmware declaring it as malicious. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 09 Jun, 2017 14 commits
-
-
David S. Miller authored
Merge tag 'linux-can-fixes-for-4.12-20170609' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2017-06-09 this is a pull request of 6 patches for net/master. There's a patch by Stephane Grosjean that fixes an uninitialized symbol warning in the peak_canfd driver. A patch by Johan Hovold to fix the product-id endianness in an error message in the the peak_usb driver. A patch by Oliver Hartkopp to enable CAN FD for virtual CAN devices by default. Three patches by me, one makes the helper function can_change_state() robust to be called with cf == NULL. The next patch fixes a memory leak in the gs_usb driver. And the last one fixes a lockdep splat by properly initialize the per-net can_rcvlists_lock spin_lock. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
The change to remove free_netdev() from ieee80211_if_free() erroneously didn't add the necessary free_netdev() for when ieee80211_if_free() is called directly in one place, rather than as the priv_destructor. Add the missing call. Fixes: cf124db5 ("net: Fix inconsistent teardown and release of private netdev state.") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
ashwanth@codeaurora.org authored
IPI's from the victim cpu are not handled in dev_cpu_callback. So these pending IPI's would be sent to the remote cpu only when NET_RX is scheduled on the victim cpu and since this trigger is unpredictable it would result in packet latencies on the remote cpu. This patch add support to send the pending ipi's of victim cpu. Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mario Molitor authored
1.) Bugfix of function stmmac_get_tx_hwtstamp. Corrected the tx timestamp available check (same as 4.8 and older) Change printout from info syslevel to debug. 2.) Bugfix of function stmmac_get_rx_hwtstamp. Corrected the rx timestamp available check (same as 4.8 and older) Change printout from info syslevel to debug. Fixes: ba1ffd74 ("stmmac: fix PTP support for GMAC4") Signed-off-by: Mario Molitor <mario_molitor@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mario Molitor authored
According the CYCLON V documention only the bit 16 of snaptypesel should set. (more information see Table 17-20 (cv_5v4.pdf) : Timestamp Snapshot Dependency on Register Bits) Fixes: d2042052 ("stmmac: update the PTP header file") Signed-off-by: Mario Molitor <mario_molitor@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Krister Johansen authored
It looks like this: Message from syslogd@flamingo at Apr 26 00:45:00 ... kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4 They seem to coincide with net namespace teardown. The message is emitted by netdev_wait_allrefs(). Forced a kdump in netdev_run_todo, but found that the refcount on the lo device was already 0 at the time we got to the panic. Used bcc to check the blocking in netdev_run_todo. The only places where we're off cpu there are in the rcu_barrier() and msleep() calls. That behavior is expected. The msleep time coincides with the amount of time we spend waiting for the refcount to reach zero; the rcu_barrier() wait times are not excessive. After looking through the list of callbacks that the netdevice notifiers invoke in this path, it appears that the dst_dev_event is the most interesting. The dst_ifdown path places a hold on the loopback_dev as part of releasing the dev associated with the original dst cache entry. Most of our notifier callbacks are straight-forward, but this one a) looks complex, and b) places a hold on the network interface in question. I constructed a new bcc script that watches various events in the liftime of a dst cache entry. Note that dst_ifdown will take a hold on the loopback device until the invalidated dst entry gets freed. [ __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183 __dst_free rcu_nocb_kthread kthread ret_from_fork Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mateusz Jurczyk authored
Verify that the caller-provided sockaddr structure is large enough to contain the sa_family field, before accessing it in bind() and connect() handlers of the AF_UNIX socket. Since neither syscall enforces a minimum size of the corresponding memory region, very short sockaddrs (zero or one byte long) result in operating on uninitialized memory while referencing .sa_family. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Fixes: 0d7e2d21 ("IB/ipoib: add get_link_ksettings in ethtool") Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oliver Hartkopp authored
CAN FD capable CAN interfaces can handle (classic) CAN 2.0 frames too. New users usually fail at their first attempt to explore CAN FD on virtual CAN interfaces due to the current CAN_MTU default. Set the MTU to CANFD_MTU by default to reduce this confusion. If someone *really* needs a 'classic CAN'-only device this can be set with the 'ip' tool with e.g. 'ip link set vcan0 mtu 16' as before. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Marc Kleine-Budde authored
This patch uses spin_lock_init() instead of __SPIN_LOCK_UNLOCKED() to initialize the per namespace net->can.can_rcvlists_lock lock to fix this lockdep warning: | INFO: trying to register non-static key. | the code is fine but needs lockdep annotation. | turning off the locking correctness validator. | CPU: 0 PID: 186 Comm: candump Not tainted 4.12.0-rc3+ #47 | Hardware name: Marvell Kirkwood (Flattened Device Tree) | [<c0016644>] (unwind_backtrace) from [<c00139a8>] (show_stack+0x18/0x1c) | [<c00139a8>] (show_stack) from [<c0058c8c>] (register_lock_class+0x1e4/0x55c) | [<c0058c8c>] (register_lock_class) from [<c005bdfc>] (__lock_acquire+0x148/0x1990) | [<c005bdfc>] (__lock_acquire) from [<c005deec>] (lock_acquire+0x174/0x210) | [<c005deec>] (lock_acquire) from [<c04a6780>] (_raw_spin_lock+0x50/0x88) | [<c04a6780>] (_raw_spin_lock) from [<bf02116c>] (can_rx_register+0x94/0x15c [can]) | [<bf02116c>] (can_rx_register [can]) from [<bf02a868>] (raw_enable_filters+0x60/0xc0 [can_raw]) | [<bf02a868>] (raw_enable_filters [can_raw]) from [<bf02ac14>] (raw_enable_allfilters+0x2c/0xa0 [can_raw]) | [<bf02ac14>] (raw_enable_allfilters [can_raw]) from [<bf02ad38>] (raw_bind+0xb0/0x250 [can_raw]) | [<bf02ad38>] (raw_bind [can_raw]) from [<c03b5fb8>] (SyS_bind+0x70/0xac) | [<c03b5fb8>] (SyS_bind) from [<c000f8c0>] (ret_fast_syscall+0x0/0x1c) Cc: Mario Kicherer <dev@kicherer.org> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Marc Kleine-Budde authored
This patch adds the missing kfree() in gs_cmd_reset() to free the memory that is not used anymore after usb_control_msg(). Cc: linux-stable <stable@vger.kernel.org> Cc: Maximilian Schneider <max@schneidersoft.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Johan Hovold authored
Make sure to use the USB device product-id stored in host-byte order in a probe error message. Also remove a redundant reassignment of the local usb_dev variable which had already been used to retrieve the product id. Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Stephane Grosjean authored
This patch fixes two uninitialized symbol warnings in the new code adding support of the PEAK-System PCAN-PCI Express FD boards, in the socket-CAN network protocol family. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
Marc Kleine-Budde authored
In OOM situations where no skb can be allocated, can_change_state() may be called with cf == NULL. As this function updates the state and error statistics it's not an option to skip the call to can_change_state() in OOM situations. This patch makes can_change_state() robust, so that it can be called with cf == NULL. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-
- 08 Jun, 2017 21 commits
-
-
David Ahern authored
Commit 1aa6c4f6 ("net: vrf: Add l3mdev rules on first device create") adds the l3mdev FIB rule the first time a VRF device is created. However, it only creates the rule once and only in the namespace the first device is created - which may not be init_net. Fix by using the net_generic capability to make the add_fib_rules flag per network namespace. Fixes: 1aa6c4f6 ("net: vrf: Add l3mdev rules on first device create") Reported-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
I noticed that test_l4lb was failing in selftests: # ./test_progs test_pkt_access:PASS:ipv4 77 nsec test_pkt_access:PASS:ipv6 44 nsec test_xdp:PASS:ipv4 2933 nsec test_xdp:PASS:ipv6 1500 nsec test_l4lb:PASS:ipv4 377 nsec test_l4lb:PASS:ipv6 544 nsec test_l4lb:FAIL:stats 6297600000 200000 test_tcp_estats:PASS: 0 nsec Summary: 7 PASSED, 1 FAILED Tracking down the issue actually revealed that endianness selection in bpf_endian.h is broken when compiled with clang with bpf target. test_pkt_access.c, test_l4lb.c is compiled with __BYTE_ORDER as __BIG_ENDIAN, test_xdp.c as __LITTLE_ENDIAN! test_l4lb noticeably fails, because the test accounts bytes via bpf_ntohs(ip6h->payload_len) and bpf_ntohs(iph->tot_len), and compares them against a defined value and given a wrong endianness, the test outcome is different, of course. Turns out that there are actually two bugs: i) when we do __BYTE_ORDER comparison with __LITTLE_ENDIAN/__BIG_ENDIAN, then depending on the include order we see different outcomes. Reason is that __BYTE_ORDER is undefined due to missing endian.h include. Before we include the asm/byteorder.h (e.g. through linux/in.h), then __BYTE_ORDER equals __LITTLE_ENDIAN since both are undefined, after the include which correctly pulls in linux/byteorder/little_endian.h, __LITTLE_ENDIAN is defined, but given __BYTE_ORDER is still undefined, we match on __BYTE_ORDER equals to __BIG_ENDIAN since __BIG_ENDIAN is also undefined at that point, sigh. ii) But even that would be wrong, since when compiling the test cases with clang, one can select between bpfeb and bpfel targets for cross compilation. Hence, we can also not rely on what the system's endian.h provides, but we need to look at the compiler's defined endianness. The compiler defines __BYTE_ORDER__, and we can match __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__, which also reflects targets bpf (native), bpfel, bpfeb correctly, thus really only rely on that. After patch: # ./test_progs test_pkt_access:PASS:ipv4 74 nsec test_pkt_access:PASS:ipv6 42 nsec test_xdp:PASS:ipv4 2340 nsec test_xdp:PASS:ipv6 1461 nsec test_l4lb:PASS:ipv4 400 nsec test_l4lb:PASS:ipv6 530 nsec test_tcp_estats:PASS: 0 nsec Summary: 7 PASSED, 0 FAILED Fixes: 43bcf707 ("bpf: fix _htons occurences in test_progs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
Each time a new speed is added, the bonding 802.3ad isn't updated. Add a comment to remind the developer to update this driver. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
This patch adds 14 Gbps enum definition, and fixes aggregated bandwidth calculation based on above slave links. Fixes: 0d7e2d21 ("IB/ipoib: add get_link_ksettings in ethtool") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Thibaut Collet authored
This patch adds [5|50] Gbps enum definition, and fixes aggregated bandwidth calculation based on above slave links. Fixes: c9a70d43 ("net-next: ethtool: Added port speed macros.") Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
The first netlink attribute (value 0) must always be defined as none/unspec. Because we cannot change an existing UAPI, I add a comment to point the mistake and avoid to propagate it in a new ovs API in the future. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
While discussing the possible merits of clang warning about unused initialized functions, I found one function that was clearly meant to be called but never actually is. __ila_hash_secret_init() initializes the hash value for the ila locator, apparently this is intended to prevent hash collision attacks, but this ends up being a read-only zero constant since there is no caller. I could find no indication of why it was never called, the earliest patch submission for the module already was like this. If my interpretation is right, we certainly want to backport the patch to stable kernels as well. I considered adding it to the ila_xlat_init callback, but for best effect the random data is read as late as possible, just before it is first used. The underlying net_get_random_once() is already highly optimized to avoid overhead when called frequently. Fixes: 7f00feaf ("ila: Add generic ILA translation facility") Cc: stable@vger.kernel.org Link: https://www.spinics.net/lists/kernel/msg2527243.htmlSigned-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function ‘rtw_cfg80211_add_monitor_if’: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:2670:10: error: ‘struct net_device’ has no member named ‘destructor’ mon_ndev->destructor = rtw_ndev_destructor; ^ Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Stephen Hemminger says: ==================== netvsc: bug fixes These are bugfixes for netvsc driver in 4.12. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
The work queue and handling of network filter parameters should be in rndis_device. This gets rid of warning from RCU checks, eliminates a race and cleans up code. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
The ndo_poll_controller function needs to schedule NAPI to pick up arriving packets and send completions. Otherwise no data will ever be received. For simple case of netconsole, it also will allow send completions to happen. Without this netpoll will eventually get stuck. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
The ethtool info command calls the netvsc get_sset_count with RTNL but not with RCU. Which causes warning: drivers/net/hyperv/netvsc_drv.c:1010 suspicious rcu_dereference_check() usage! Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Roopa reported attempts to delete a bond device that is referenced in a multipath route is hanging: $ ifdown bond2 # ifupdown2 command that deletes virtual devices unregister_netdevice: waiting for bond2 to become free. Usage count = 2 Steps to reproduce: echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown ip link add dev bond12 type bond ip link add dev bond13 type bond ip addr add 2001:db8:2::0/64 dev bond12 ip addr add 2001:db8:3::0/64 dev bond13 ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2 ip link del dev bond12 ip link del dev bond13 The root cause is the recent change to keep routes on a linkdown. Update the check to detect when the device is unregistering and release the route for that case. Fixes: a1a22c12 ("net: ipv6: Keep nexthop of multipath route on admin down") Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mintz, Yuval authored
Some of the structure's fields are not initialized by the rtnetlink. If driver doesn't set those in ndo_get_vf_config(), they'd leak memory to user. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> CC: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mateusz Jurczyk authored
Verify that the length of the socket buffer is sufficient to cover the nlmsghdr structure before accessing the nlh->nlmsg_len field for further input sanitization. If the client only supplies 1-3 bytes of data in sk_buff, then nlh->nlmsg_len remains partially uninitialized and contains leftover memory from the corresponding kernel allocation. Operating on such data may result in indeterminate evaluation of the nlmsg_len < sizeof(*nlh) expression. The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel. The patch prevents this and other similar tools (e.g. KMSAN) from flagging this behavior in the future. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
This reverts commit 85eac2ba. There is an updated version of this fix which we should use instead. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christian Lamparter authored
emac_mdio_read_link() was not copying the requested phy settings back into the emac driver's own phy api. This has caused a link speed mismatch issue for the AR8035 as the emac driver kept trying to connect with 10/100MBps on a 1GBit/s link. This patch also unifies shared code between emac_setup_aneg() and emac_mdio_setup_forced(). And furthermore it removes a chunk of emac_mdio_init_phy(), that was copying the same data into itself. Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christian Lamparter authored
This patch fixes a problem where the AR8035 PHY can't be detected on an Cisco Meraki MR24, if the ethernet cable is not connected on boot. Russell Senior provided steps to reproduce the issue: |Disconnect ethernet cable, apply power, wait until device has booted, |plug in ethernet, check for interfaces, no eth0 is listed. | |This appears to be a problem during probing of the AR8035 Phy chip. |When ethernet has no link, the phy detection fails, and eth0 is not |created. Plugging ethernet later has no effect, because there is no |interface as far as the kernel is concerned. The relevant part of |the boot log looks like this: |this is the failing case: | |[ 0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode |[ 0.882532] /plb/opb/ethernet@ef600c00: reset timeout |[ 0.888546] /plb/opb/ethernet@ef600c00: can't find PHY! |and the succeeding case: | |[ 0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode |[ 0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:.. |[ 0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01) Based on the comment and the commit message of commit 23fbb5a8 ("emac: Fix EMAC soft reset on 460EX/GT"). This is because the AR8035 PHY doesn't provide the TX Clock, if the ethernet cable is not attached. This causes the reset to timeout and the PHY detection code in emac_init_phy() is unable to detect the AR8035 PHY. As a result, the emac driver bails out early and the user left with no ethernet. In order to stay compatible with existing configurations, the driver tries the current reset approach at first. Only if the first attempt timed out, it does perform one more retry with the clock temporarily switched to the internal source for just the duration of the reset. LEDE-Bug: #687 <https://bugs.lede-project.org/index.php?do=details&task_id=687> Cc: Chris Blake <chrisrblake93@gmail.com> Reported-by: Russell Senior <russell@personaltelco.net> Fixes: 23fbb5a8 ("emac: Fix EMAC soft reset on 460EX/GT") Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mateusz Jurczyk authored
Verify that the length of the socket buffer is sufficient to cover the entire nlh->nlmsg_len field before accessing that field for further input sanitization. If the client only supplies 1-3 bytes of data in sk_buff, then nlh->nlmsg_len remains partially uninitialized and contains leftover memory from the corresponding kernel allocation. Operating on such data may result in indeterminate evaluation of the nlmsg_len < sizeof(*nlh) expression. The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel. The patch prevents this and other similar tools (e.g. KMSAN) from flagging this behavior in the future. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
> ../drivers/hsi/clients/ssi_protocol.c:1069:5: error: 'struct net_device' has no member named 'destructor' Reported-by: Mark Brown <broonie@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Rothwell authored
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Jun, 2017 2 commits
-
-
David S. Miller authored
Network devices can allocate reasources and private memory using netdev_ops->ndo_init(). However, the release of these resources can occur in one of two different places. Either netdev_ops->ndo_uninit() or netdev->destructor(). The decision of which operation frees the resources depends upon whether it is necessary for all netdev refs to be released before it is safe to perform the freeing. netdev_ops->ndo_uninit() presumably can occur right after the NETDEV_UNREGISTER notifier completes and the unicast and multicast address lists are flushed. netdev->destructor(), on the other hand, does not run until the netdev references all go away. Further complicating the situation is that netdev->destructor() almost universally does also a free_netdev(). This creates a problem for the logic in register_netdevice(). Because all callers of register_netdevice() manage the freeing of the netdev, and invoke free_netdev(dev) if register_netdevice() fails. If netdev_ops->ndo_init() succeeds, but something else fails inside of register_netdevice(), it does call ndo_ops->ndo_uninit(). But it is not able to invoke netdev->destructor(). This is because netdev->destructor() will do a free_netdev() and then the caller of register_netdevice() will do the same. However, this means that the resources that would normally be released by netdev->destructor() will not be. Over the years drivers have added local hacks to deal with this, by invoking their destructor parts by hand when register_netdevice() fails. Many drivers do not try to deal with this, and instead we have leaks. Let's close this hole by formalizing the distinction between what private things need to be freed up by netdev->destructor() and whether the driver needs unregister_netdevice() to perform the free_netdev(). netdev->priv_destructor() performs all actions to free up the private resources that used to be freed by netdev->destructor(), except for free_netdev(). netdev->needs_free_netdev is a boolean that indicates whether free_netdev() should be done at the end of unregister_netdevice(). Now, register_netdevice() can sanely release all resources after ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit() and netdev->priv_destructor(). And at the end of unregister_netdevice(), we invoke netdev->priv_destructor() and optionally call free_netdev(). Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Will reported that in BPF_XADD we must use a different register in stxr instruction for the status flag due to otherwise CONSTRAINED UNPREDICTABLE behavior per architecture. Reference manual says [1]: If s == t, then one of the following behaviors must occur: * The instruction is UNDEFINED. * The instruction executes as a NOP. * The instruction performs the store to the specified address, but the value stored is UNKNOWN. Thus, use a different temporary register for the status flag to fix it. Disassembly extract from test 226/STX_XADD_DW from test_bpf.ko: [...] 0000003c: c85f7d4b ldxr x11, [x10] 00000040: 8b07016b add x11, x11, x7 00000044: c80c7d4b stxr w12, x11, [x10] 00000048: 35ffffac cbnz w12, 0x0000003c [...] [1] https://static.docs.arm.com/ddi0487/b/DDI0487B_a_armv8_arm.pdf, p.6132 Fixes: 85f68fe8 ("bpf, arm64: implement jiting of BPF_XADD") Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-