- 01 Feb, 2019 6 commits
-
-
Alexei Starovoitov authored
Introduce BPF_F_LOCK flag for map_lookup and map_update syscall commands and for map_update() helper function. In all these cases take a lock of existing element (which was provided in BTF description) before copying (in or out) the rest of map value. Implementation details that are part of uapi: Array: The array map takes the element lock for lookup/update. Hash: hash map also takes the lock for lookup/update and tries to avoid the bucket lock. If old element exists it takes the element lock and updates the element in place. If element doesn't exist it allocates new one and inserts into hash table while holding the bucket lock. In rare case the hashmap has to take both the bucket lock and the element lock to update old value in place. Cgroup local storage: It is similar to array. update in place and lookup are done with lock taken. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
add bpf_spin_lock C based test that requires latest llvm with BTF support Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
add bpf_spin_lock tests to test_verifier.c that don't require latest llvm with BTF support Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
sync bpf.h Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
Allow 'struct bpf_spin_lock' to reside inside cgroup local storage. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
Introduce 'struct bpf_spin_lock' and bpf_spin_lock/unlock() helpers to let bpf program serialize access to other variables. Example: struct hash_elem { int cnt; struct bpf_spin_lock lock; }; struct hash_elem * val = bpf_map_lookup_elem(&hash_map, &key); if (val) { bpf_spin_lock(&val->lock); val->cnt++; bpf_spin_unlock(&val->lock); } Restrictions and safety checks: - bpf_spin_lock is only allowed inside HASH and ARRAY maps. - BTF description of the map is mandatory for safety analysis. - bpf program can take one bpf_spin_lock at a time, since two or more can cause dead locks. - only one 'struct bpf_spin_lock' is allowed per map element. It drastically simplifies implementation yet allows bpf program to use any number of bpf_spin_locks. - when bpf_spin_lock is taken the calls (either bpf2bpf or helpers) are not allowed. - bpf program must bpf_spin_unlock() before return. - bpf program can access 'struct bpf_spin_lock' only via bpf_spin_lock()/bpf_spin_unlock() helpers. - load/store into 'struct bpf_spin_lock lock;' field is not allowed. - to use bpf_spin_lock() helper the BTF description of map value must be a struct and have 'struct bpf_spin_lock anyname;' field at the top level. Nested lock inside another struct is not allowed. - syscall map_lookup doesn't copy bpf_spin_lock field to user space. - syscall map_update and program map_update do not update bpf_spin_lock field. - bpf_spin_lock cannot be on the stack or inside networking packet. bpf_spin_lock can only be inside HASH or ARRAY map value. - bpf_spin_lock is available to root only and to all program types. - bpf_spin_lock is not allowed in inner maps of map-in-map. - ld_abs is not allowed inside spin_lock-ed region. - tracing progs and socket filter progs cannot use bpf_spin_lock due to insufficient preemption checks Implementation details: - cgroup-bpf class of programs can nest with xdp/tc programs. Hence bpf_spin_lock is equivalent to spin_lock_irqsave. Other solutions to avoid nested bpf_spin_lock are possible. Like making sure that all networking progs run with softirq disabled. spin_lock_irqsave is the simplest and doesn't add overhead to the programs that don't use it. - arch_spinlock_t is used when its implemented as queued_spin_lock - archs can force their own arch_spinlock_t - on architectures where queued_spin_lock is not available and sizeof(arch_spinlock_t) != sizeof(__u32) trivial lock is used. - presence of bpf_spin_lock inside map value could have been indicated via extra flag during map_create, but specifying it via BTF is cleaner. It provides introspection for map key/value and reduces user mistakes. Next steps: - allow bpf_spin_lock in other map types (like cgroup local storage) - introduce BPF_F_LOCK flag for bpf_map_update() syscall and helper to request kernel to grab bpf_spin_lock before rewriting the value. That will serialize access to map elements. Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 31 Jan, 2019 9 commits
-
-
Valdis Kletnieks authored
Building with W=1 reveals some bitrot: CC kernel/bpf/cgroup.o kernel/bpf/cgroup.c:238: warning: Function parameter or member 'flags' not described in '__cgroup_bpf_attach' kernel/bpf/cgroup.c:367: warning: Function parameter or member 'unused_flags' not described in '__cgroup_bpf_detach' Add a kerneldoc line for 'flags'. Fixing the warning for 'unused_flags' is best approached by removing the unused parameter on the function call. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Valdis Kletnieks authored
Compiling with W=1 generates warnings: CC kernel/bpf/core.o kernel/bpf/core.c:721:12: warning: no previous prototype for ?bpf_jit_alloc_exec_limit? [-Wmissing-prototypes] 721 | u64 __weak bpf_jit_alloc_exec_limit(void) | ^~~~~~~~~~~~~~~~~~~~~~~~ kernel/bpf/core.c:757:14: warning: no previous prototype for ?bpf_jit_alloc_exec? [-Wmissing-prototypes] 757 | void *__weak bpf_jit_alloc_exec(unsigned long size) | ^~~~~~~~~~~~~~~~~~ kernel/bpf/core.c:762:13: warning: no previous prototype for ?bpf_jit_free_exec? [-Wmissing-prototypes] 762 | void __weak bpf_jit_free_exec(void *addr) | ^~~~~~~~~~~~~~~~~ All three are weak functions that archs can override, provide proper prototypes for when a new arch provides their own. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Valdis Kletnieks authored
Over the years, the function signature has changed, but the kerneldoc block hasn't. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Daniel Borkmann authored
Stanislav Fomichev says: ==================== If test_maps/test_verifier is running against the kernel which doesn't have _all_ BPF features enabled, it fails with an error. This patch series tries to probe kernel support for each failed test and skip it instead. This lets users run BPF selftests in the not-all-bpf-yes environments and received correct PASS/NON-PASS result. See https://www.spinics.net/lists/netdev/msg539331.html for more context. The series goes like this: * patch #1 skips sockmap tests in test_maps.c if BPF_MAP_TYPE_SOCKMAP map is not supported (if bpf_create_map fails, we probe the kernel for support) * patch #2 skips verifier tests if test->prog_type is not supported (if bpf_verify_program fails, we probe the kernel for support) * patch #3 skips verifier tests if test fixup map is not supported (if create_map fails, we probe the kernel for support) * next patches fix various small issues that arise from the first four: * patch #4 sets "unknown func bpf_trace_printk#6" prog_type to BPF_PROG_TYPE_TRACEPOINT so it is correctly skipped in CONFIG_BPF_EVENTS=n case * patch #5 exposes BPF_PROG_TYPE_CGROUP_{SKB,SOCK,SOCK_ADDR} only when CONFIG_CGROUP_BPF=y, this makes verifier correctly skip appropriate tests v3 changes: * rebased on top of Quentin's series which adds probes to libbpf v2 changes: * don't sprinkle "ifdef CONFIG_CGROUP_BPF" all around net/core/filter.c, doing it only in the bpf_types.h is enough to disable BPF_PROG_TYPE_CGROUP_{SKB,SOCK,SOCK_ADDR} prog types for non-cgroup enabled kernels ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Stanislav Fomichev authored
There is no way to exercise appropriate attach points without cgroups enabled. This lets test_verifier correctly skip tests for these prog_types if kernel was compiled without BPF cgroup support. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Stanislav Fomichev authored
We don't have this helper if the kernel was compiled without CONFIG_BPF_EVENTS. Setting prog_type to BPF_PROG_TYPE_TRACEPOINT let's verifier correctly skip this test based on the missing prog_type support in the kernel. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Stanislav Fomichev authored
Use recently introduced bpf_probe_map_type() to skip tests in the test_verifier if map creation (create_map) fails. It's handled explicitly for each fixup, i.e. if bpf_create_map returns negative fd, we probe the kernel for the appropriate map support and skip the test is map type is not supported. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Stanislav Fomichev authored
Use recently introduced bpf_probe_prog_type() to skip tests in the test_verifier() if bpf_verify_program() fails. The skipped test is indicated in the output. Example: ... 679/p bpf_get_stack return R0 within range SKIP (unsupported program type 5) 680/p ld_abs: invalid op 1 OK ... Summary: 863 PASSED, 165 SKIPPED, 3 FAILED Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Stanislav Fomichev authored
Use recently introduced bpf_probe_map_type() to skip test_sockmap() if map creation fails. The skipped test is indicated in the output. Example: test_sockmap SKIP (unsupported map type BPF_MAP_TYPE_SOCKMAP) Fork 1024 tasks to 'test_update_delete' ... test_sockmap SKIP (unsupported map type BPF_MAP_TYPE_SOCKMAP) Fork 1024 tasks to 'test_update_delete' ... test_maps: OK, 2 SKIPPED Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 30 Jan, 2019 25 commits
-
-
David S. Miller authored
Huazhong Tan says: ==================== code optimizations & bugfixes for HNS3 driver This patchset includes bugfixes and code optimizations for the HNS3 ethernet controller driver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jian Shen authored
In orginal codes, driver always enables flow director when intializing. When user disable flow director with command ethtool -K, the flow director will be enabled again after resetting. This patch fixes it by only enabling it when first initialzing. Fixes: 6871af29 ("net: hns3: Add reset handle for flow director") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jian Shen authored
When VF is resetting, it can't communicate to PF with mailbox msg. This patch adds reset state checking before sending keep alive msg to PF. Fixes: a6d818e3 ("net: hns3: Add vport alive state checking support") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Peng Li authored
HNS3 VF driver support NIC and Roce, hdev stores NIC handle and Roce handle, should use correct parameter for container_of. Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huazhong Tan authored
While hclge_init_umv_space() failed in the hclge_init_ae_dev(), we should undo all the operation which has been done successfully, the last success operation maybe hclge_mac_mdio_config(), so if hclge_init_umv_space() failed, we also need to undo it. Fixes: 288475b2ad01 ("{topost} net: hns3: refine umv space allocation") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jian Shen authored
The rss result is more uniform when use recommended hash key from microsoft, instead of the one generated by netdev_rss_key_fill(). Also using hash algorithm "xor" is better than "toeplitz". This patch modifies the default hash key and hash algorithm. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huazhong Tan authored
When the driver is unloading, if a global reset occurs, unmap_ring_from_vector() in the hns3_nic_uninit_vector_data() will fail, and hns3_nic_uninit_vector_data() just return. There may be some netif_napi_del() not be done. Since hardware will unmap all ring while resetting, so hns3_nic_uninit_vector_data() should ignore this error, and do the rest uninitialization. Fixes: 76ad4f0e ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huazhong Tan authored
When the driver is unloading, if there is a calling of ndo_open occurs between phy_disconnect() and unregister_netdev(), it will end up causing the kernel to eventually hit a NULL deref: [14942.417828] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048 [14942.529878] Mem abort info: [14942.551166] ESR = 0x96000006 [14942.567070] Exception class = DABT (current EL), IL = 32 bits [14942.623081] SET = 0, FnV = 0 [14942.639112] EA = 0, S1PTW = 0 [14942.643628] Data abort info: [14942.659227] ISV = 0, ISS = 0x00000006 [14942.674870] CM = 0, WnR = 0 [14942.679449] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000224ad6ad [14942.695595] [0000000000000048] pgd=00000021e6673003, pud=00000021dbf01003, pmd=0000000000000000 [14942.723163] Internal error: Oops: 96000006 [#1] PREEMPT SMP [14942.729358] Modules linked in: hns3(O) hclge(O) pv680_mii(O) hnae3(O) [last unloaded: hclge] [14942.738907] CPU: 1 PID: 26629 Comm: kworker/u4:13 Tainted: G O 4.18.0-rc1-12928-ga960791-dirty #145 [14942.749491] Hardware name: Huawei Technologies Co., Ltd. D05/D05, BIOS Hi1620 FPGA TB BOOT BIOS B763 08/17/2018 [14942.760392] Workqueue: events_power_efficient phy_state_machine [14942.766644] pstate: 80c00009 (Nzcv daif +PAN +UAO) [14942.771918] pc : test_and_set_bit+0x18/0x38 [14942.776589] lr : netif_carrier_off+0x24/0x70 [14942.781033] sp : ffff0000121abd20 [14942.784518] x29: ffff0000121abd20 x28: 0000000000000000 [14942.790208] x27: ffff0000164d3cd8 x26: ffff8021da68b7b8 [14942.795832] x25: 0000000000000000 x24: ffff8021eb407800 [14942.801445] x23: 0000000000000000 x22: 0000000000000000 [14942.807046] x21: 0000000000000001 x20: 0000000000000000 [14942.812672] x19: 0000000000000000 x18: ffff000009781708 [14942.818284] x17: 00000000004970e8 x16: ffff00000816ad48 [14942.823900] x15: 0000000000000000 x14: 0000000000000008 [14942.829528] x13: 0000000000000000 x12: 0000000000000f65 [14942.835149] x11: 0000000000000001 x10: 00000000000009d0 [14942.840753] x9 : ffff0000121abaa0 x8 : 0000000000000000 [14942.846360] x7 : ffff000009781708 x6 : 0000000000000003 [14942.851970] x5 : 0000000000000020 x4 : 0000000000000004 [14942.857575] x3 : 0000000000000002 x2 : 0000000000000001 [14942.863180] x1 : 0000000000000048 x0 : 0000000000000000 [14942.868875] Process kworker/u4:13 (pid: 26629, stack limit = 0x00000000c909dbf3) [14942.876464] Call trace: [14942.879200] test_and_set_bit+0x18/0x38 [14942.883376] phy_link_change+0x38/0x78 [14942.887378] phy_state_machine+0x3dc/0x4f8 [14942.891968] process_one_work+0x158/0x470 [14942.896223] worker_thread+0x50/0x470 [14942.900219] kthread+0x104/0x130 [14942.903905] ret_from_fork+0x10/0x1c [14942.907755] Code: d2800022 8b400c21 f9800031 9ac32044 (c85f7c22) [14942.914185] ---[ end trace 968c9e12eb740b23 ]--- So this patch fixes it by modifying the timing to do phy_connect_direct() and phy_disconnect(). Fixes: 256727da ("net: hns3: Add MDIO support to HNS3 Ethernet driver for hip08 SoC") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yunsheng Lin authored
When the VF shares the same TC config as PF, the business running on PF and VF must have samiliar module. For simplicity, we are not considering VF sharing the same tc configuration as PF use case, so this patch removes the support of TC configuration from VF and forcing VF to just use single TC. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huazhong Tan authored
hnae3_register_ae_dev() may fail, and it should return a error code to its caller, so change hnae3_register_ae_dev() return type to int. Also, when hnae3_register_ae_dev() return error, hns3_probe() should do some error handling and return the error code. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Peng Li authored
dev_close() stop the netdev and the service base on the netdev will stop. But ndev->netdev_ops->ndo_stop() may only stop HW and stack queue, the service base on the netdev can still work. Fixes: 5668abda ("net: hns3: add support for set_ringparam") Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jian Shen authored
In original codes, the .get_regs_len and .get_regs were missed assigned. This patch fixes it. Fixes: 1600c3e5 ("net: hns3: Support "ethtool -d" for HNS3 VF driver") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
liyongxin authored
Union l3_hdr_info and l4_hdr_info have already been defined in the hns3_enet.h, so it is unnecessary to define them elsewhere. This patch removes the redundant definition, and reuses the one defined in the hns3_enet.h. Signed-off-by: liyongxin <liyongxin1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Greg Ungerer says: ==================== net: dsa: mt7530: support MT7530 in the MT7621 SoC This is the fourth version of a patch series supporting the MT7530 switch as used in the MediaTek MT7621 SoC. Unlike the MediaTek MT7623 the MT7621 is built around a dual core MIPS CPU architecture. But inside it uses basically the same 7530 switch. This series resolves all issues I had with previous versions, and I can now reliably use the driver on a 7621 SoC platform. These patches were generated against linux-5.0-rc4. The first patch enables support for the existing kernel mediatek ethernet driver on the MT7621 SoC. This support is from Bjørn Mork, with an update and fix by me. Using this driver fixed a number of problems I had (TX checksums, large RX packet drop) over the staging driver (drivers/staging/mt7621-eth). Patch 2 modifies the mt7530 DSA driver to support the 7530 switch as implemented in the Mediatek MT7621 SoC. The last patch updates the devicetree bindings to reflect the new support in the mt7530 driver. There is no real dependencies between the patches, so they can be taken independantly. Creating a new binding for the MT7621 seems like the only viable approach to distinguish between a stand alone 7530 switch, the silicon module in the MT7623 SoC and the silicon in the MT7621. Certainly the 7530 ID register in the MT7623 and MT7621 returns the same value, "0x7530001". Looking at the mt7530.c DSA driver it might make some sense to convert the existing "mediatek,mcm" binding to something like "mediatek,mt7623" to be consistent with this new MT7621 support. As far as I can tell this is the intention of this binding. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Greg Ungerer authored
Add devicetree binding to support the compatible mt7530 switch as used in the MediaTek MT7621 SoC. Signed-off-by: Greg Ungerer <gerg@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Sean Wang <sean.wang@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Greg Ungerer authored
The MediaTek MT7621 SoC device contains a 7530 switch, and the existing linux kernel 7530 DSA switch driver can be used with it. The bulk of the changes required stem from the 7621 having different regulator and pad setup. The existing setup of these in the 7530 driver appears to be very specific to its implemtation in the Mediatek 7623 SoC. (Not entirely surprising given the 7623 is a quad core ARM based SoC, and the 7621 is a dual core, dual thread MIPS based SoC). Create a new devicetree type, "mediatek,mt7621", to support the 7530 switch in the 7621 SoC. There appears to be no usable ID register to distinguish it from a 7530 in other hardware at runtime. This is used to carry out the appropriate configuration and setup. Signed-off-by: Greg Ungerer <gerg@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Sean Wang <sean.wang@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bjørn Mork authored
The Mediatek MT7621 SoC contains the same ethernet hardware module as used on a number of other MediaTek SoC parts. There are some minor differences to deal with but we can use the same driver to support them all. This patch is based on work by Bjørn Mork <bjorn@mork.no>, and his original patch is at: https://github.com/bmork/LEDE/commit/3293bc63f5461ca1eb0bbc4ed90145335e7e3404 There is an additional compatible devicetree type added, and the primary change to the code required is to support a single interrupt (for both RX and TX interrupts). Signed-off-by: Bjørn Mork <bjorn@mork.no> [gerg@kernel.org: rebase to mainline and irq handler fix] Signed-off-by: Greg Ungerer <gerg@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Sean Wang <sean.wang@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ido Schimmel says: ==================== mlxsw: spectrum_acl: Include delta bits into hashtable key The Spectrum-2 ASIC allows multiple rules to use the same mask provided that the difference between their masks is small enough (up to 8 consecutive delta bits). A more detailed explanation is provided in merge commit 756cd366 ("Merge branch 'mlxsw-Introduce-algorithmic-TCAM-support'"). These delta bits are part of the rule's key and therefore rules that only differ in their delta bits can be inserted with the same A-TCAM mask. In case two rules share the same key and only differ in their priority, then the second will spill to the C-TCAM. Current code does not take the delta bits into account when checking for duplicate rules, which leads to unnecessary spillage to the C-TCAM. This may result in reduced scale and performance. Patch #1 includes the delta bits in the rule's key to avoid the above mentioned problem. Patch #2 adds a tracepoint when a rule is inserted into the C-TCAM. Patches #3-#5 add test cases to make sure unnecessary spillage into the C-TCAM does not occur. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Ensure that the bug is fixed and we no longer have C-TCAM spill for two keys that differ only in delta. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
With recent fix in C-TCAM spillage for delta masks, the test stops to be falsely positive. So fix it not to use delta by adding src_ip bits to the masks. Alongside with that, use C-TCAM spill trace to see when the spillage actually happens. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Allow to specify number of trace hits and move helpers to the beginning of the file. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Add some visibility to the rule addition process and trace whenever rule spilled into C-TCAM. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Currently only ERP mask masked bits in key are considered for the hashtable key. That leads to false negative collisions and fallbacks to C-TCAM in case two keys differ only in delta bits. Fix this by taking full encoded key as a hashtable key, including delta bits. Reported-by: Nir Dotan <nird@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Xin Long says: ==================== sctp: support SCTP_FUTURE/CURRENT/ALL_ASSOC This patchset adds the support for 3 assoc_id constants: SCTP_FUTURE_ASSOC SCTP_CURRENT_ASSOC, SCTP_ALL_ASSOC, described in rfc6458#section-7.2: All socket options set on a one-to-one style listening socket also apply to all future accepted sockets. For one-to-many style sockets, often a socket option will pass a structure that includes an assoc_id field. This field can be filled with the association identifier of a particular association and unless otherwise specified can be filled with one of the following constants: SCTP_FUTURE_ASSOC: Specifies that only future associations created after this socket option will be affected by this call. SCTP_CURRENT_ASSOC: Specifies that only currently existing associations will be affected by this call, and future associations will still receive the previous default value. SCTP_ALL_ASSOC: Specifies that all current and future associations will be affected by this call. The functions for many other sockopts that use assoc_id also need to be updated accordingly. ==================== Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_scheduler and check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_scheduler, it's compatible with 0. SCTP_CURRENT_ASSOC is supported for SCTP_STREAM_SCHEDULER in this patch. It also adds default_ss in sctp_sock to support SCTP_FUTURE_ASSOC. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-