- 10 Oct, 2017 4 commits
-
-
-
Wei Wang authored
This patch replaces rcu_deference() with rcu_dereference_bh() in ipv6_route_seq_next() to avoid the following warning: [ 19.431685] WARNING: suspicious RCU usage [ 19.433451] 4.14.0-rc3-00914-g66f5d6ce #118 Not tainted [ 19.435509] ----------------------------- [ 19.437267] net/ipv6/ip6_fib.c:2259 suspicious rcu_dereference_check() usage! [ 19.440790] [ 19.440790] other info that might help us debug this: [ 19.440790] [ 19.444734] [ 19.444734] rcu_scheduler_active = 2, debug_locks = 1 [ 19.447757] 2 locks held by odhcpd/3720: [ 19.449480] #0: (&p->lock){+.+.}, at: [<ffffffffb1231f7d>] seq_read+0x3c/0x333 [ 19.452720] #1: (rcu_read_lock_bh){....}, at: [<ffffffffb1d2b984>] ipv6_route_seq_start+0x5/0xfd [ 19.456323] [ 19.456323] stack backtrace: [ 19.458812] CPU: 0 PID: 3720 Comm: odhcpd Not tainted 4.14.0-rc3-00914-g66f5d6ce #118 [ 19.462042] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 19.465414] Call Trace: [ 19.466788] dump_stack+0x86/0xc0 [ 19.468358] lockdep_rcu_suspicious+0xea/0xf3 [ 19.470183] ipv6_route_seq_next+0x71/0x164 [ 19.471963] seq_read+0x244/0x333 [ 19.473522] proc_reg_read+0x48/0x67 [ 19.475152] ? proc_reg_write+0x67/0x67 [ 19.476862] __vfs_read+0x26/0x10b [ 19.478463] ? __might_fault+0x37/0x84 [ 19.480148] vfs_read+0xba/0x146 [ 19.481690] SyS_read+0x51/0x8e [ 19.483197] do_int80_syscall_32+0x66/0x15a [ 19.484969] entry_INT80_compat+0x32/0x50 [ 19.486707] RIP: 0023:0xf7f0be8e [ 19.488244] RSP: 002b:00000000ffa75d04 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 19.491431] RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 0000000008056068 [ 19.493886] RDX: 0000000000001000 RSI: 0000000008056008 RDI: 0000000000001000 [ 19.496331] RBP: 00000000000001ff R08: 0000000000000000 R09: 0000000000000000 [ 19.498768] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 19.501217] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Fixes: 66f5d6ce ("ipv6: replace rwlock with rcu and spinlock in fib6_table") Reported-by: Xiaolong Ye <xiaolong.ye@intel.com> Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Linus Torvalds authored
Merge powerpc transactional memory fixes from Michael Ellerman: "I figured I'd still send you the commits using a bundle to make sure it works in case I need to do it again in future" This fixes transactional memory state restore for powerpc. * bundle'd patches from Michael Ellerman: powerpc/tm: Fix illegal TM state in signal handler powerpc/64s: Use emergency stack for kernel TM Bad Thing program checks
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller authored
Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-09 This series contains updates to i40e and i40evf only. Jake fixes missed flag conversion from u64 to u32. Fixes a deafult ITR value issue where the driver defaults to an ITR value of half the expected value (in terms of minimum microseconds between interrupts). So fix this by changing the default values to be calculated using the ITR_REG_TO_USEC() macro which indicates that we are converting from the register units into microseconds. Updates the drivers to bump the tail in increments of 8 and double the number of descriptors we will bundle into one tail bump when receiving. With the recent kernel support for enabling XPS and QoS at the same time, we no longer need to worry about the number of traffic classes when enabling XPS. Lihong converts the use of hash_for_each() to hash_for_each_safe() to safely remove a hash entry. Adds a check for the return value for find_first_bit() in the case that it returns the size passed to search. Alan fixes a bug in which filters are erroneously removed if they are removed and then added again. So make sure that when adding a filter, if we find it already existed in our list, make sure it is not marked to be removed. Jayaprakash adds the retrying of PHY reads when the I2C is busy for a maximum period of 500ms. Rami fixes code comment typo. Stefano Brivio simplifies the code by removing the use of a local return code variable and simply return the results of the read function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 09 Oct, 2017 36 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller authored
Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2017-10-09 This series contains updates to ixgbe only. Emil fixes an issue where the semaphore bits could be stuck after a reset or a crash, by adding the clearing of software resource bits in the software/firmware synchronization register. Added error checks when we attempt to identify and initialize the PHY to prevent a crash. Fixed a few issues in the logic of ixgbe_clean_test_rings() which was exposed by a previous commit that was causing a crash in ethtool diagnostics. Bhumika Goyal fixes a couple of instances which were overlooked when we made ixgbe_mac_operations constant. Shannon Nelson fixes an issue to restore normal operations after the last MACVLAN offload is removed, otherwise we get stuck in a single queue operations. The infamous Jesper Dangaard Brouer adds a counter which counts the number of times the recycle fails and the real page allocator is invoked. Alex updates the adaptive ITR algorithm to better support the needs of the network. This attempt to make it so that our ITR algorithm will try to prevent either starving a socket buffer for memory in the case of transmit, or overrunning an receive socket buffer on receive. We should function better with new features like XDP which can handle small packets at high rates without needing to lock us into NAPI polling mode. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Fix object leak on IPSEC offload failure, from Steffen Klassert. 2) Fix range checks in ipset address range addition operations, from Jozsef Kadlecsik. 3) Fix pernet ops unregistration order in ipset, from Florian Westphal. 4) Add missing netlink attribute policy for nl80211 packet pattern attrs, from Peng Xu. 5) Fix PPP device destruction race, from Guillaume Nault. 6) Write marks get lost when BPF verifier processes R1=R2 register assignments, causing incorrect liveness information and less state pruning. Fix from Alexei Starovoitov. 7) Fix blockhole routes so that they are marked dead and therefore not cached in sockets, otherwise IPSEC stops working. From Steffen Klassert. 8) Fix broadcast handling of UDP socket early demux, from Paolo Abeni. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (37 commits) cdc_ether: flag the u-blox TOBY-L2 and SARA-U2 as wwan net: thunderx: mark expected switch fall-throughs in nicvf_main() udp: fix bcast packet reception netlink: do not set cb_running if dump's start() errs ipv4: Fix traffic triggered IPsec connections. ipv6: Fix traffic triggered IPsec connections. ixgbe: incorrect XDP ring accounting in ethtool tx_frame param net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag Revert commit 1a8b6d76 ("net:add one common config...") ixgbe: fix masking of bits read from IXGBE_VXLANCTRL register ixgbe: Return error when getting PHY address if PHY access is not supported netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1' netfilter: SYNPROXY: skip non-tcp packet in {ipv4, ipv6}_synproxy_hook tipc: Unclone message at secondary destination lookup tipc: correct initialization of skb list gso: fix payload length when gso_size is zero mlxsw: spectrum_router: Avoid expensive lookup during route removal bpf: fix liveness marking doc: Fix typo "8023.ad" in bonding documentation ipv6: fix net.ipv6.conf.all.accept_dad behaviour for real ...
-
Aleksander Morgado authored
The u-blox TOBY-L2 is a LTE Cat 4 module with HSPA+ and 2G fallback. This module allows switching to different USB profiles with the 'AT+UUSBCONF' command, and provides a ECM network interface when the 'AT+UUSBCONF=2' profile is selected. The u-blox SARA-U2 is a HSPA module with 2G fallback. The default USB configuration includes a ECM network interface. Both these modules are controlled via AT commands through one of the TTYs exposed. Connecting these modules may be done just by activating the desired PDP context with 'AT+CGACT=1,<cid>' and then running DHCP on the ECM interface. Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stefano Brivio authored
Fixes: 09f79fd4 ("i40e: avoid NVM acquire deadlock during NVM update") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Rami Rosen authored
This patch fixes a typo in i40e_vsi_alloc_arrays() documentation. The first parameter name should be "vsi" instead of "type". Signed-off-by: Rami Rosen <rami.rosen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Lihong Yang authored
The computed result of I40E_MAX_VSI_QP * I40E_VIRTCHNL_SUPPORTED_QTYPES is used more than three times in function i40e_config_irq_link_list. Simply declare a local variable to store it to improve readability. Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jayaprakash Shanmugam authored
- When the I2C is busy, the PHY reads are delayed. The firmware will return EGAIN in these cases with an expectation that the SW will trigger the reads again - This patch retries the operation for a maximum period of 500ms Signed-off-by: Jayaprakash Shanmugam <jayaprakash.shanmugam@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Lihong Yang authored
The find_first_bit function will return the size passed to search if the first set bit is not found. This patch adds the check in case that happens as the return value would be used as the index in an array and that would have caused the out-of-bounds access. Detected by CoverityScan, CID 1295969 Out-of-bounds access Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Recently, the kernel gained support for enabling XPS and QoS at the same time. Thus, we no longer need to worry about the number of traffic classes when enabling XPS. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Double the number of descriptors we'll bundle into one tail bump when receiving. Empirical testing has shown that we reduce CPU utilization and don't appear to reduce throughput or packet rate. 32 seems to be the sweet spot, as it's half the default polling budget, so we'd essentially reduce from 4 tail writes when polling down to 2. Increasing this up to 64 appears to have negative impacts as it may become possible that we don't bump the tail each time we get polled, which could cause a long delay between returning descriptors to the hardware. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Hardware only fetches descriptors on cachelines of 8, essentially ignoring the lower 3 bits of the tail register. Thus, it is pointless to bump tail by an unaligned access as the hardware will ignore some of the new descriptors we allocated. Thus, it's ideal if we can ensure tail writes are always aligned to 8. At first, it seems like we'd already do this, since we allocate descriptors in batches which are a multiple of 8. Since we'd always increment by a multiple of 8, it seems like the value should always be aligned. However, this ignores allocation failures. If we fail to allocate a buffer, our tail register will become unaligned. Once it has become unaligned it will essentially be stuck unaligned until a buffer allocation happens to fail at the exact amount necessary to re-align it. We can do better, by simply rounding down the number of buffers we're about to allocate (cleaned_count) such that "next_to_clean + cleaned_count" is rounded to the nearest multiple of 8. We do this by calculating how far off that value is and subtracting it from the cleaned_count. This essentially defers allocation of buffers if they're going to be ignored by hardware anyways, and re-aligns our next_to_use and tail values after a failure to allocate a descriptor. This calculation ensures that we always align the tail writes in a way the hardware expects and don't unnecessarily allocate buffers which won't be fetched immediately. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
The lrxq thresh value tells hardware to immediately interrupt when there are fewer than N*64 packets left in the ring. Counter intuitively, empirical testing has shown that decreasing this value from 2 to 1, and thus changing from an immediate interrupt at fewer than 128 descriptors down to 64 descriptors causes a small increase in the maximum total packets per second we can receive. This increase occurs even when we're polling with interrupts masked, as the hardware must still handle interrupts internally even if we've disabled them in software. Also reduce the value for any VFs we allocate. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
In the past we changed driver behavior to not clear the PBA when re-enabling interrupts. This change was motivated by the flawed belief that clearing the PBA would cause a lost interrupt if a receive interrupt occurred while interrupts were disabled. According to empirical testing this isn't the case. Additionally, the data sheet specifically says that we should set the CLEARPBA bit when re-enabling interrupts in a polling setup. This reverts commit 40d72a50 ("i40e/i40evf: don't lose interrupts") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
The ITR register expects to be programmed in units of 2 microseconds. Because of this, all of the drivers I40E_ITR_* constants are in terms of this 2 microsecond register. Unfortunately, the rx_itr_default value is expected to be programmed in microseconds. Effectively the driver defaults to an ITR value of half the expected value (in terms of minimum microseconds between interrupts). Fix this by changing the default values to be calculated using ITR_REG_TO_USEC macro which indicates that we're converting from the register units into microseconds. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Alan Brady authored
Due to the asynchronous nature in which mac filters are added and deleted, there exists a bug in which filters are erroneously removed if removed then added again quickly. The events are as such: - filter marked for removal - same filter is re-added before watchdog that cleans up filters - we skip re-adding the filter because we have it already in the list - watchdog filter cleanup kicks off and filter is removed So when we were re-adding the same filter, it didn't actually get added because it already existed in the list, but was marked for removal and had yet to actually be removed. This patch fixes the issue by making sure that when adding a filter, if we find it already existing in our list, make sure it is not marked to be removed. Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Lihong Yang authored
This patch replaces hash_for_each function with hash_for_each_safe when calling __i40e_del_filter. The hash_for_each_safe function is the right one to use when iterating over a hash table to safely remove a hash entry. Otherwise, incorrect values may be read from freed memory. Detected by CoverityScan, CID 1402048 Read from pointer after free Signed-off-by: Lihong Yang <lihong.yang@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Since we don't yet have more than 32 flags, we'll use a u32 for both the hw_features and flag field. Should we gain more flags in the future, we may need to convert to a u64 or separate flags out into two fields. This was overlooked in the previous commit 2781de2134c4 ("i40e/i40evf: organize and re-number feature flags"), where the feature flag was not converted form u64 to u32. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds authored
Pull NFS client bugfixes from Trond Myklebust: "Hightlights include: stable fixes: - nfs/filelayout: fix oops when freeing filelayout segment - NFS: Fix uninitialized rpc_wait_queue bugfixes: - NFSv4/pnfs: Fix an infinite layoutget loop - nfs: RPC_MAX_AUTH_SIZE is in bytes" * tag 'nfs-for-4.14-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4/pnfs: Fix an infinite layoutget loop nfs/filelayout: fix oops when freeing filelayout segment sunrpc: remove redundant initialization of sock NFS: Fix uninitialized rpc_wait_queue NFS: Cleanup error handling in nfs_idmap_request_key() nfs: RPC_MAX_AUTH_SIZE is in bytes
-
David S. Miller authored
Eric Dumazet says: ==================== ipv6: addrlabel: avoid dirtying ip6addrlbl_entry The refcount on ip6addrlbl_entry is only used to make sure ip6addrlbl_entry does not disappear while ip6addrlbl_get() is allocating an skb. We can instead allocate skb first, then use RCU, so that we no longer need to refcount these structures. ==================== Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
After previous patch ("ipv6: addrlabel: rework ip6addrlbl_get()") we can remove the refcount from struct ip6addrlbl_entry, since it is no longer elevated in p6addrlbl_get() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
If we allocate skb before the lookup, we can use RCU without the need of ip6addrlbl_hold() This means that the following patch can get rid of refcounting. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gustavo A. R. Silva authored
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Robert Richter <rric@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller authored
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree, they are: 1) Fix packet drops due to incorrect ECN handling in IPVS, from Vadim Fedorenko. 2) Fix splat with mark restoration in xt_socket with non-full-sock, patch from Subash Abhinov Kasiviswanathan. 3) ipset bogusly bails out when adding IPv4 range containing more than 2^31 addresses, from Jozsef Kadlecsik. 4) Incorrect pernet unregistration order in ipset, from Florian Westphal. 5) Races between dump and swap in ipset results in BUG_ON splats, from Ross Lagerwall. 6) Fix chain renames in nf_tables, from JingPiao Chen. 7) Fix race in pernet codepath with ebtables table registration, from Artem Savkov. 8) Memory leak in error path in set name allocation in nf_tables, patch from Arvind Yadav. 9) Don't dump chain counters if they are not available, this fixes a crash when listing the ruleset. 10) Fix out of bound memory read in strlcpy() in x_tables compat code, from Eric Dumazet. 11) Make sure we only process TCP packets in SYNPROXY hooks, patch from Lin Zhang. 12) Cannot load rules incrementally anymore after xt_bpf with pinned objects, added in revision 1. From Shmulik Ladkani. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queueDavid S. Miller authored
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2017-10-09 This series contains updates to ixgbe and arch/Kconfig. Mark fixes a case where PHY register access is not supported and we were returning a PHY address, when we should have been returning -EOPNOTSUPP. Sabrina Dubroca fixes the use of a logical "and" when it should have been the bitwise "and" operator. Ding Tianhong reverts the commit that added the Kconfig bool option ARCH_WANT_RELAX_ORDER, since there is now a new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING that has been added to indicate that Relaxed Ordering Attributes should not be used for Transaction Layer Packets. Then follows up with making the needed changes to ixgbe to use the new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flag. John Fastabend fixes an issue in the ring accounting when the transmit ring parameters are changed via ethtool when an XDP program is attached. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Tariq Toukan says: ==================== Fix mlx4 static checker warnings This patchset contains fixes for static checker warnings in the mlx4 Core and Eth drivers. Patch 1 fixes an actual bug discovered by the checker. Patches 2 and 3 fix the warnings without functional changes. Series generated against net-next commit: c49c777f qed: Delete redundant check on dcb_app priority ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
In TX data-path, we intentionally do not byte-swap, as documented in code and in the cited commit log. This fixes sparse warning: en_tx.c:720:23: warning: incorrect type in argument 1 (different base types) en_tx.c:720:23: expected unsigned int [unsigned] [usertype] <noident> en_tx.c:720:23: got restricted __be32 [usertype] doorbell_qpn Fixes: 492f5add ("net/mlx4_en: Doorbell is byteswapped in Little Endian archs") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
Fix the following SPARSE warning, in MLX4_GET() macro: drivers/net/ethernet/mellanox/mlx4/fw.c:233:9: warning: cast to restricted __be64 Fixes: 17d5ceb6 ("net/mlx4_core: Fix unaligned accesses") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
Should take care of the endianness before assigning to params2 field. Fixes: 53f33ae2 ("net/mlx4_core: Port aggregation upper layer interface") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mika Westerberg authored
The 0day kbuild robot reports following crash: BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: tb_property_find+0xe/0x41 *pde = 00000000 Oops: 0000 [#1] CPU: 0 PID: 1 Comm: swapper Not tainted 4.14.0-rc1-00741-ge69b6c02 #412 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 task: 89c80000 task.stack: 89c7c000 EIP: tb_property_find+0xe/0x41 EFLAGS: 00210246 CPU: 0 EAX: 00000000 EBX: 7a368f47 ECX: 00000044 EDX: 7a368f47 ESI: 8851d340 EDI: 7a368f47 EBP: 89c7df0c ESP: 89c7defc DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 CR0: 80050033 CR2: 00000004 CR3: 027a2000 CR4: 00000690 Call Trace: tb_register_property_dir+0x49/0xb9 ? cdc_mbim_driver_init+0x1b/0x1b tbnet_init+0x77/0x9f ? cdc_mbim_driver_init+0x1b/0x1b do_one_initcall+0x7e/0x145 ? parse_args+0x10c/0x1b3 ? kernel_init_freeable+0xbe/0x159 kernel_init_freeable+0xd1/0x159 ? rest_init+0x110/0x110 kernel_init+0xd/0xd0 ret_from_fork+0x19/0x30 The reason is that both Thunderbolt bus and thunderbolt-net are build into the kernel image, and the latter is linked first because drivers/net comes before drivers/thunderbolt. Since both use module_init() thunderbolt-net ends up calling Thunderbolt bus functions too early triggering the above crash. Fix this by moving Thunderbolt bus initialization to happen earlier to make sure all the data structures are ready when Thunderbolt service drivers are initialized. To be on the safe side also add a check for properly initialized xdomain_property_dir to tb_register_property_dir(). Reported-by: kernel test robot <fengguang.wu@intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
per cpu allocations are already zeroed, no need to clear them again. Fixes: d52d3997 ("ipv6: Create percpu rt6_info") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Tejun Heo <tj@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
The commit bc044e8d ("udp: perform source validation for mcast early demux") does not take into account that broadcast packets lands in the same code path and they need different checks for the source address - notably, zero source address are valid for bcast and invalid for mcast. As a result, 2nd and later broadcast packets with 0 source address landing to the same socket are dropped. This breaks dhcp servers. Since we don't have stringent performance requirements for ingress broadcast traffic, fix it by disabling UDP early demux such traffic. Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Fixes: bc044e8d ("udp: perform source validation for mcast early demux") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason A. Donenfeld authored
It turns out that multiple places can call netlink_dump(), which means it's still possible to dereference partially initialized values in dump() that were the result of a faulty returned start(). This fixes the issue by calling start() _before_ setting cb_running to true, so that there's no chance at all of hitting the dump() function through any indirect paths. It also moves the call to start() to be when the mutex is held. This has the nice side effect of serializing invocations to start(), which is likely desirable anyway. It also prevents any possible other races that might come out of this logic. In testing this with several different pieces of tricky code to trigger these issues, this commit fixes all avenues that I'm aware of. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Michal Kalderon says: ==================== qed: Add iWARP support for unaligned MPA packets This patch series adds support for handling unaligned MPA packets. (FPDUs split over more than one tcp packet). When FW detects a packet is unaligned it fowards the packet to the driver via a light l2 dedicated connection. The driver then stores this packet until the remainder of the packet is received. Once the driver reconstructs the full FPDU, it sends it down to fw via the ll2 connection. Driver also breaks down any packed PDUs into separate packets for FW. Patches 1-6 are all slight modifications to ll2 to support additional requirements for the unaligned MPA ll2 client. Patch 7 opens the additional ll2 connection for iWARP. Patches 8-12 contain the algorithm for aligning packets. ==================== Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Kalderon authored
We continue to maintain a maximum of three buffers per fpdu, to ensure that there are enough buffers for additional unaligned mpa packets. To support this, if a fpdu is split over more than two tcp packets, we use an intermediate buffer to copy the data to the previous buffer, then we can release the data. We need an intermediate buffer as the initial buffer partial packet could be located at the end of the packet, not leaving room for additional data. This is a corner case, and will usually not be the case. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Kalderon authored
There is a special case where an MPA header is split over to tcp packets, in this case we need to wait for the next packet to get the fpdu length. We use the incomplete_bytes to mark this fpdu as a "special" one which requires updating the length with the next packet Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Kalderon authored
When posting a packet on the ll2 tx, we can provide a cookie that will be returned upon tx completion. This cookie is the ll2 iwarp buffer which is then reposted to the rx ring. Part of the unaligned mpa flow is determining when a buffer can be reposted. Each buffer needs to be sent only once as a cookie for on the tx ring. In packed fpdu case, only the last packet will be sent with the buffer, meaning we need to handle the case that a cookie can be NULL on tx complete. In addition, when a fpdu splits over two buffers, but there are no more fpdus on the second buffer, two buffers need to be provided as a cookie. To avoid changing the ll2 interface to provide two cookies, we introduce a piggy buf pointer, relevant for iWARP only, that holds a pointer to a second buffer that needs to be released during tx completion. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-