- 11 Nov, 2019 10 commits
-
-
Magnus Karlsson authored
The libbpf AF_XDP code is extended to allow for the creation of Rx only or Tx only sockets. Previously it returned an error if the socket was not initialized for both Rx and Tx. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-4-git-send-email-magnus.karlsson@intel.com
-
Magnus Karlsson authored
Add support for the XDP_SHARED_UMEM mode to the xdpsock sample application. As libbpf does not have a built in XDP program for this mode, we use an explicitly loaded XDP program. This also serves as an example on how to write your own XDP program that can route to an AF_XDP socket. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-3-git-send-email-magnus.karlsson@intel.com
-
Magnus Karlsson authored
Add support in libbpf to create multiple sockets that share a single umem. Note that an external XDP program need to be supplied that routes the incoming traffic to the desired sockets. So you need to supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load your own XDP program. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-2-git-send-email-magnus.karlsson@intel.com
-
Alexei Starovoitov authored
Toke Høiland-Jørgensen says: ==================== This series fixes a few bugs in libbpf that I discovered while playing around with the new auto-pinning code, and writing the first utility in xdp-tools[0]: - If object loading fails, libbpf does not clean up the pinnings created by the auto-pinning mechanism. - EPERM is not propagated to the caller on program load - Netlink functions write error messages directly to stderr In addition, libbpf currently only has a somewhat limited getter function for XDP link info, which makes it impossible to discover whether an attached program is in SKB mode or not. So the last patch in the series adds a new getter for XDP link info which returns all the information returned via netlink (and which can be extended later). Finally, add a getter for BPF program size, which can be used by the caller to estimate the amount of locked memory needed to load a program. A selftest is added for the pinning change, while the other features were tested in the xdp-filter tool from the xdp-tools repo. The 'new-libbpf-features' branch contains the commits that make use of the new XDP getter and the corrected EPERM error code. [0] https://github.com/xdp-project/xdp-tools Changelog: v4: - Don't do any size checks on struct xdp_info, just copy (and/or zero) whatever size the caller supplied. v3: - Pass through all kernel error codes on program load (instead of just EPERM). - No new bpf_object__unload() variant, just do the loop at the caller - Don't reject struct xdp_info sizes that are bigger than what we expect. - Add a comment noting that bpf_program__size() returns the size in bytes v2: - Keep function names in libbpf.map sorted properly ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Toke Høiland-Jørgensen authored
This adds a new getter for the BPF program size (in bytes). This is useful for a caller that is trying to predict how much memory will be locked by loading a BPF object into the kernel. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333185272.88376.10996937115395724683.stgit@toke.dk
-
Toke Høiland-Jørgensen authored
Currently, libbpf only provides a function to get a single ID for the XDP program attached to the interface. However, it can be useful to get the full set of program IDs attached, along with the attachment mode, in one go. Add a new getter function to support this, using an extendible structure to carry the information. Express the old bpf_get_link_id() function in terms of the new function. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157333185164.88376.7520653040667637246.stgit@toke.dk
-
Toke Høiland-Jørgensen authored
The netlink functions were using fprintf(stderr, ) directly to print out error messages, instead of going through the usual logging macros. This makes it impossible for the calling application to silence or redirect those error messages. Fix this by switching to pr_warn() in nlattr.c and netlink.c. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333185055.88376.15999360127117901443.stgit@toke.dk
-
Toke Høiland-Jørgensen authored
When loading an eBPF program, libbpf overrides the return code for EPERM errors instead of returning it to the caller. This makes it hard to figure out what went wrong on load. In particular, EPERM is returned when the system rlimit is too low to lock the memory required for the BPF program. Previously, this was somewhat obscured because the rlimit error would be hit on map creation (which does return it correctly). However, since maps can now be reused, object load can proceed all the way to loading programs without hitting the error; propagating it even in this case makes it possible for the caller to react appropriately (and, e.g., attempt to raise the rlimit before retrying). Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333184946.88376.11768171652794234561.stgit@toke.dk
-
Toke Høiland-Jørgensen authored
This add tests for the different variations of automatic map unpinning on load failure. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333184838.88376.8243704248624814775.stgit@toke.dk
-
Toke Høiland-Jørgensen authored
Since the automatic map-pinning happens during load, it will leave pinned maps around if the load fails at a later stage. Fix this by unpinning any pinned maps on cleanup. To avoid unpinning pinned maps that were reused rather than newly pinned, add a new boolean property on struct bpf_map to keep track of whether that map was reused or not; and only unpin those maps that were not reused. Fixes: 57a00f41 ("libbpf: Add auto-pinning of maps when loading BPF objects") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333184731.88376.9992935027056165873.stgit@toke.dk
-
- 09 Nov, 2019 2 commits
-
-
Daniel T. Lee authored
Since, the new syntax of BTF-defined map has been introduced, the syntax for using maps under samples directory are mixed up. For example, some are already using the new syntax, and some are using existing syntax by calling them as 'legacy'. As stated at commit abd29c93 ("libbpf: allow specifying map definitions using BTF"), the BTF-defined map has more compatablility with extending supported map definition features. The commit doesn't replace all of the map to new BTF-defined map, because some of the samples still use bpf_load instead of libbpf, which can't properly create BTF-defined map. This will only updates the samples which uses libbpf API for loading bpf program. (ex. bpf_prog_load_xattr) Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel T. Lee authored
Currently, under samples, several methods are being used to load bpf program. Since using libbpf is preferred solution, lots of previously used 'load_bpf_file' from bpf_load are replaced with 'bpf_prog_load_xattr' from libbpf. But some of the error messages still show up as 'load_bpf_file' instead of 'bpf_prog_load_xattr'. This commit fixes outdated errror messages under samples and fixes some code style issues. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191107005153.31541-2-danieltimlee@gmail.com
-
- 07 Nov, 2019 14 commits
-
-
Martin KaFai Lau authored
Access the skb->cb[] in the kfree_skb test. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191107180905.4097871-1-kafai@fb.com
-
Martin KaFai Lau authored
This patch adds array support to btf_struct_access(). It supports array of int, array of struct and multidimensional array. It also allows using u8[] as a scratch space. For example, it allows access the "char cb[48]" with size larger than the array's element "char". Another potential use case is "u64 icsk_ca_priv[]" in the tcp congestion control. btf_resolve_size() is added to resolve the size of any type. It will follow the modifier if there is any. Please see the function comment for details. This patch also adds the "off < moff" check at the beginning of the for loop. It is to reject cases when "off" is pointing to a "hole" in a struct. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191107180903.4097702-1-kafai@fb.com
-
Daniel Borkmann authored
Andrii Nakryiko says: ==================== Github's mirror of libbpf got LGTM and Coverity statis analysis running against it and spotted few real bugs and few potential issues. This patch series fixes found issues. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Andrii Nakryiko authored
If we get ELF file with "maps" section, but no symbols pointing to it, we'll end up with division by zero. Add check against this situation and exit early with error. Found by Coverity scan against Github libbpf sources. Fixes: bf829271 ("libbpf: refactor map initialization") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-6-andriin@fb.com
-
Andrii Nakryiko authored
Perform size check always in btf__resolve_size. Makes the logic a bit more robust against corrupted BTF and silences LGTM/Coverity complaining about always true (size < 0) check. Fixes: 69eaab04 ("btf: extract BTF type size calculation") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-5-andriin@fb.com
-
Andrii Nakryiko authored
Fix few issues found by Coverity and LGTM. Fixes: b053b439 ("bpf: libbpf: bpftool: Print bpf_line_info during prog dump") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-4-andriin@fb.com
-
Andrii Nakryiko authored
Fix a potential overflow issue found by LGTM analysis, based on Github libbpf source code. Fixes: 3d650141 ("bpf: libbpf: Add btf_line_info support to libbpf") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-3-andriin@fb.com
-
Andrii Nakryiko authored
Coverity scan against Github libbpf code found the issue of not freeing memory and leaving already freed memory still referenced from bpf_program. Fix it by re-assigning successfully reallocated memory sooner. Fixes: 2993e051 ("tools/bpf: add support to read .BTF.ext sections") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-2-andriin@fb.com
-
Andrii Nakryiko authored
Fix issue reported by static analysis (Coverity). If bpf_prog_get_fd_by_id() fails, xsk_lookup_bpf_maps() will fail as well and clean-up code will attempt close() with fd=-1. Fix by checking bpf_prog_get_fd_by_id() return result and exiting early. Fixes: 10a13bb4 ("libbpf: remove qidconf and better support external bpf programs.") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107054059.313884-1-andriin@fb.com
-
Ilya Leoshkevich authored
We don't need them since commit e1cf4bef ("bpf, s390x: remove ld_abs/ld_ind") and commit a3212b8f ("bpf, s390x: remove obsolete exception handling from div/mod"). Also, use BIT(n) instead of 1 << n, because checkpatch says so. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107114033.90505-1-iii@linux.ibm.com
-
Ilya Leoshkevich authored
This change does not alter JIT behavior; it only makes it possible to safely invoke JIT macros with complex arguments in the future. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107113211.90105-1-iii@linux.ibm.com
-
Ilya Leoshkevich authored
A BPF program may consist of 1m instructions, which means JIT instruction-address mapping can be as large as 4m. s390 has FORCE_MAX_ZONEORDER=9 (for memory hotplug reasons), which means maximum kmalloc size is 1m. This makes it impossible to JIT programs with more than 256k instructions. Fix by using kvcalloc, which falls back to vmalloc for larger allocations. An alternative would be to use a radix tree, but that is not supported by bpf_prog_fill_jited_linfo. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107141838.92202-1-iii@linux.ibm.com
-
Ilya Leoshkevich authored
When compiling larger programs with bpf_asm, it's possible to accidentally exceed jt/jf range, in which case it won't complain, but rather silently emit a truncated offset, leading to a "happy debugging" situation. Add a warning to help detecting such issues. It could be made an error instead, but this might break compilation of existing code (which might be working by accident). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107100349.88976-1-iii@linux.ibm.com
-
Martin KaFai Lau authored
In the bpf interpreter mode, bpf_probe_read_kernel is used to read from PTR_TO_BTF_ID's kernel object. It currently missed considering the insn->off. This patch fixes it. Fixes: 2a02759e ("bpf: Add support for BTF pointers to interpreter") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191107014640.384083-1-kafai@fb.com
-
- 06 Nov, 2019 2 commits
-
-
Andrii Nakryiko authored
Streamline BPF_CORE_READ_BITFIELD_PROBED interface to follow BPF_CORE_READ_BITFIELD (direct) and BPF_CORE_READ, in general, i.e., just return read result or 0, if underlying bpf_probe_read() failed. In practice, real applications rarely check bpf_probe_read() result, because it has to always work or otherwise it's a bug. So propagating internal bpf_probe_read() error from this macro hurts usability without providing real benefits in practice. This patch fixes the issue and simplifies usage, noticeable even in selftest itself. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191106201500.2582438-1-andriin@fb.com
-
Andrii Nakryiko authored
As part of 42765ede ("selftests/bpf: Remove too strict field offset relo test cases"), few ints relocations negative (supposed to fail) tests were removed, but not completely. Due to them being negative, some leftovers in prog_tests/core_reloc.c went unnoticed. Clean them up. Fixes: 42765ede ("selftests/bpf: Remove too strict field offset relo test cases") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191106173659.1978131-1-andriin@fb.com
-
- 04 Nov, 2019 12 commits
-
-
Daniel Borkmann authored
Andrii Nakryiko says: ==================== This patch set adds support for reading bitfields in a relocatable manner through a set of relocations emitted by Clang, corresponding libbpf support for those relocations, as well as abstracting details into BPF_CORE_READ_BITFIELD/BPF_CORE_READ_BITFIELD_PROBED macro. We also add support for capturing relocatable field size, so that BPF program code can adjust its logic to actual amount of data it needs to operate on, even if it changes between kernels. New convenience macro is added to bpf_core_read.h (bpf_core_field_size(), in the same family of macro as bpf_core_read() and bpf_core_field_exists()). Corresponding set of selftests are added to excercise this logic and validate correctness in a variety of scenarios. Some of the overly strict logic of matching fields is relaxed to support wider variety of scenarios. See patch #1 for that. Patch #1 removes few overly strict test cases. Patch #2 adds support for bitfield-related relocations. Patch #3 adds some further adjustments to support generic field size relocations and introduces bpf_core_field_size() macro. Patch #4 tests bitfield reading. Patch #5 tests field size relocations. v1 -> v2: - added direct memory read-based macro and tests for bitfield reads. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Andrii Nakryiko authored
Add test verifying correctness and logic of field size relocation support in libbpf. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191101222810.1246166-6-andriin@fb.com
-
Andrii Nakryiko authored
Add a bunch of selftests verifying correctness of relocatable bitfield reading support in libbpf. Both bpf_probe_read()-based and direct read-based bitfield macros are tested. core_reloc.c "test_harness" is extended to support raw tracepoint and new typed raw tracepoints as test BPF program types. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191101222810.1246166-5-andriin@fb.com
-
Andrii Nakryiko authored
Add bpf_core_field_size() macro, capturing a relocation against field size. Adjust bits of internal libbpf relocation logic to allow capturing size relocations of various field types: arrays, structs/unions, enums, etc. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191101222810.1246166-4-andriin@fb.com
-
Andrii Nakryiko authored
Add support for the new field relocation kinds, necessary to support relocatable bitfield reads. Provide macro for abstracting necessary code doing full relocatable bitfield extraction into u64 value. Two separate macros are provided: - BPF_CORE_READ_BITFIELD macro for direct memory read-enabled BPF programs (e.g., typed raw tracepoints). It uses direct memory dereference to extract bitfield backing integer value. - BPF_CORE_READ_BITFIELD_PROBED macro for cases where bpf_probe_read() needs to be used to extract same backing integer value. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191101222810.1246166-3-andriin@fb.com
-
Andrii Nakryiko authored
As libbpf is going to gain support for more field relocations, including field size, some restrictions about exact size match are going to be lifted. Remove test cases that explicitly test such failures. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191101222810.1246166-2-andriin@fb.com
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== mlx5-updates-2019-11-01 Misc updates for mlx5 netdev and core driver 1) Steering Core: Replace CRC32 internal implementation with standard kernel lib. 2) Steering Core: Support IPv4 and IPv6 mixed matcher. 3) Steering Core: Lockless FTE read lookups 4) TC: Bit sized fields rewrite support. 5) FPGA: Standalone FPGA support. 6) SRIOV: Reset VF parameters configurations on SRIOV disable. 7) netdev: Dump WQs wqe descriptors on CQE with error events. 8) MISC Cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
YueHaibing authored
drivers/isdn/hardware/mISDN/mISDNisar.c:30:17: warning: faxmodulation_s defined but not used [-Wunused-const-variable=] It is never used, so can be removed. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vincent Cheng authored
The IDT ClockMatrix (TM) family includes integrated devices that provide eight PLL channels. Each PLL channel can be independently configured as a frequency synthesizer, jitter attenuator, digitally controlled oscillator (DCO), or a digital phase lock loop (DPLL). Typically these devices are used as timing references and clock sources for PTP applications. This patch adds support for the device. Co-developed-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vincent Cheng authored
Add device tree binding doc for the IDT ClockMatrix PTP clock. Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Francesco Ruggeri authored
traceroute6 output can be confusing, in that it shows the address that a router would use to reach the sender, rather than the address the packet used to reach the router. Consider this case: ------------------------ N2 | | ------ ------ N3 ---- | R1 | | R2 |------|H2| ------ ------ ---- | | ------------------------ N1 | ---- |H1| ---- where H1's default route is through R1, and R1's default route is through R2 over N2. traceroute6 from H1 to H2 shows R2's address on N1 rather than on N2. The script below can be used to reproduce this scenario. traceroute6 output without this patch: traceroute to 2000:103::4 (2000:103::4), 30 hops max, 80 byte packets 1 2000:101::1 (2000:101::1) 0.036 ms 0.008 ms 0.006 ms 2 2000:101::2 (2000:101::2) 0.011 ms 0.008 ms 0.007 ms 3 2000:103::4 (2000:103::4) 0.013 ms 0.010 ms 0.009 ms traceroute6 output with this patch: traceroute to 2000:103::4 (2000:103::4), 30 hops max, 80 byte packets 1 2000:101::1 (2000:101::1) 0.056 ms 0.019 ms 0.006 ms 2 2000:102::2 (2000:102::2) 0.013 ms 0.008 ms 0.008 ms 3 2000:103::4 (2000:103::4) 0.013 ms 0.009 ms 0.009 ms #!/bin/bash # # ------------------------ N2 # | | # ------ ------ N3 ---- # | R1 | | R2 |------|H2| # ------ ------ ---- # | | # ------------------------ N1 # | # ---- # |H1| # ---- # # N1: 2000:101::/64 # N2: 2000:102::/64 # N3: 2000:103::/64 # # R1's host part of address: 1 # R2's host part of address: 2 # H1's host part of address: 3 # H2's host part of address: 4 # # For example: # the IPv6 address of R1's interface on N2 is 2000:102::1/64 # # Nets are implemented by macvlan interfaces (bridge mode) over # dummy interfaces. # # Create net namespaces ip netns add host1 ip netns add host2 ip netns add rtr1 ip netns add rtr2 # Create nets ip link add net1 type dummy; ip link set net1 up ip link add net2 type dummy; ip link set net2 up ip link add net3 type dummy; ip link set net3 up # Add interfaces to net1, move them to their nemaspaces ip link add link net1 dev host1net1 type macvlan mode bridge ip link set host1net1 netns host1 ip link add link net1 dev rtr1net1 type macvlan mode bridge ip link set rtr1net1 netns rtr1 ip link add link net1 dev rtr2net1 type macvlan mode bridge ip link set rtr2net1 netns rtr2 # Add interfaces to net2, move them to their nemaspaces ip link add link net2 dev rtr1net2 type macvlan mode bridge ip link set rtr1net2 netns rtr1 ip link add link net2 dev rtr2net2 type macvlan mode bridge ip link set rtr2net2 netns rtr2 # Add interfaces to net3, move them to their nemaspaces ip link add link net3 dev rtr2net3 type macvlan mode bridge ip link set rtr2net3 netns rtr2 ip link add link net3 dev host2net3 type macvlan mode bridge ip link set host2net3 netns host2 # Configure interfaces and routes in host1 ip netns exec host1 ip link set lo up ip netns exec host1 ip link set host1net1 up ip netns exec host1 ip -6 addr add 2000:101::3/64 dev host1net1 ip netns exec host1 ip -6 route add default via 2000:101::1 # Configure interfaces and routes in rtr1 ip netns exec rtr1 ip link set lo up ip netns exec rtr1 ip link set rtr1net1 up ip netns exec rtr1 ip -6 addr add 2000:101::1/64 dev rtr1net1 ip netns exec rtr1 ip link set rtr1net2 up ip netns exec rtr1 ip -6 addr add 2000:102::1/64 dev rtr1net2 ip netns exec rtr1 ip -6 route add default via 2000:102::2 ip netns exec rtr1 sysctl net.ipv6.conf.all.forwarding=1 # Configure interfaces and routes in rtr2 ip netns exec rtr2 ip link set lo up ip netns exec rtr2 ip link set rtr2net1 up ip netns exec rtr2 ip -6 addr add 2000:101::2/64 dev rtr2net1 ip netns exec rtr2 ip link set rtr2net2 up ip netns exec rtr2 ip -6 addr add 2000:102::2/64 dev rtr2net2 ip netns exec rtr2 ip link set rtr2net3 up ip netns exec rtr2 ip -6 addr add 2000:103::2/64 dev rtr2net3 ip netns exec rtr2 sysctl net.ipv6.conf.all.forwarding=1 # Configure interfaces and routes in host2 ip netns exec host2 ip link set lo up ip netns exec host2 ip link set host2net3 up ip netns exec host2 ip -6 addr add 2000:103::4/64 dev host2net3 ip netns exec host2 ip -6 route add default via 2000:103::2 # Ping host2 from host1 ip netns exec host1 ping6 -c5 2000:103::4 # Traceroute host2 from host1 ip netns exec host1 traceroute6 2000:103::4 # Delete nets ip link del net3 ip link del net2 ip link del net1 # Delete namespaces ip netns del rtr2 ip netns del rtr1 ip netns del host2 ip netns del host1 Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Original-patch-by: Honggang Xu <hxu@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tuong Lien authored
As mentioned in commit e95584a8 ("tipc: fix unlimited bundling of small messages"), the current message bundling algorithm is inefficient that can generate bundles of only one payload message, that causes unnecessary overheads for both the sender and receiver. This commit re-designs the 'tipc_msg_make_bundle()' function (now named as 'tipc_msg_try_bundle()'), so that when a message comes at the first place, we will just check & keep a reference to it if the message is suitable for bundling. The message buffer will be put into the link backlog queue and processed as normal. Later on, when another one comes we will make a bundle with the first message if possible and so on... This way, a bundle if really needed will always consist of at least two payload messages. Otherwise, we let the first buffer go its way without any need of bundling, so reduce the overheads to zero. Moreover, since now we have both the messages in hand, we can even optimize the 'tipc_msg_bundle()' function, make bundle of a very large (size ~ MSS) and small messages which is not with the current algorithm e.g. [1400-byte message] + [10-byte message] (MTU = 1500). Acked-by: Ying Xue <ying.xue@windreiver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-