- 02 Apr, 2018 20 commits
-
-
Ilya Dryomov authored
All object requests are associated with an image request now -- avoid duplicating the same info in each object request. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
There are no standalone (!IMG_DATA) object requests anymore. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
Do away with partial request completions and all the associated complexity. Individual object requests no longer need to be completed in order -- when the last one becomes ready, we complete the entire higher level request all at once. This also wraps up the conversion to a state machine model and eliminates the recursion described in commit 6d69bb53 ("rbd: prevent kernel stack blow up on rbd map"). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
It should be void now. Also, object requests are unlinked only in image request destructor, which can't run before rbd_img_request_put(), so no need for _safe. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
Store op_type in its own field instead of packing it into flags. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
No need to pass rbd_dev and op_type to rbd_osd_req_create(): there are no standalone (!IMG_DATA) object requests anymore and osd_req->r_flags can be set in rbd_osd_req_format_{read,write}(). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
The notable changes are: - instead of explicitly stat'ing the object to see if it exists before issuing the write, send the write optimistically along with the stat in a single OSD request - zero copyup optimization - all object requests are associated with an image request and have a valid ->img_request pointer; there are no standalone (!IMG_DATA) object requests anymore - code is structured as a state machine (vs a bunch of callbacks with implicit state) Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
rbd needs this for null copyups -- if copyup data is all zeroes, we want to save some I/O and network bandwidth. See rbd_obj_issue_copyup() in the next commit. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
In preparation for rbd "fancy" striping which requires bio_vec arrays, wire up BVECS data type and kill off PAGES data type. There is nothing wrong with using page vectors for copyup requests -- it's just less iterator boilerplate code to write for the new striping framework. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
In preparation for rbd "fancy" striping, introduce ceph_bvec_iter for working with bio_vec array data buffers. The wrappers are trivial, but make it look similar to ceph_bio_iter. Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
The initiating object request is the proper owner -- save a bit of space. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
obj_req->pages is for provided data buffers. stat requests are internal and should be NODATA. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
The reason we clone bios is to be able to give each object request (and consequently each ceph_osd_data/ceph_msg_data item) its own pointer to a (list of) bio(s). The messenger then initializes its cursor with cloned bio's ->bi_iter, so it knows where to start reading from/writing to. That's all the cloned bios are used for: to determine each object request's starting position in the provided data buffer. Introduce ceph_bio_iter to do exactly that -- store position within bio list (i.e. pointer to bio) + position within that bio (i.e. bvec_iter). Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
- make it void - xlen (object extent length) out parameter should be u32 because only a single stripe unit is mapped at a time Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
bl, stripeno and objsetno should be u64 -- otherwise large enough files get corrupted. How large depends on file layout: - 4M-objects layout (default): any file over 16P - 64K-objects layout (smallest possible object size): any file over 512T Only CephFS is affected, rbd doesn't use ceph_calc_file_object_mapping() yet. Fortunately, CephFS has a max_file_size configurable, the default for which is way below both of the above numbers. Reimplement the logic from scratch with no layout validation -- it's done on the MDS side. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
Ilya Dryomov authored
Commit 21acdf45 ("rbd: set max_segments to USHRT_MAX") removed the limit on max_segments. Remove the limit on max_segment_size as well. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Alex Elder <elder@linaro.org>
-
- 01 Apr, 2018 1 commit
-
-
Linus Torvalds authored
-
- 31 Mar, 2018 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Ingo Molnar: "Two fixlets" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/hwbp: Simplify the perf-hwbp code, fix documentation perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Ingo Molnar: "Two UV platform fixes, and a kbuild fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/UV: Fix critical UV MMR address error x86/platform/uv/BAU: Add APIC idt entry x86/purgatory: Avoid creating stray .<pid>.d files, remove -MD from KBUILD_CFLAGS
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 PTI fixes from Ingo Molnar: "Two fixes: a relatively simple objtool fix that makes Clang built kernels work with ORC debug info, plus an alternatives macro fix" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternatives: Fixup alternative_call_2 objtool: Add Clang support
-
Linus Torvalds authored
Merge tag 'kbuild-fixes-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - fix missed rebuild of TRIM_UNUSED_KSYMS - fix rpm-pkg for GNU tar >= 1.29 - include scripts/dtc/include-prefixes/* to kernel header deb-pkg - add -no-integrated-as option ealier to fix building with Clang - fix netfilter Makefile for parallel building * tag 'kbuild-fixes-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: netfilter: nf_nat_snmp_basic: add correct dependency to Makefile kbuild: rpm-pkg: Support GNU tar >= 1.29 builddeb: Fix header package regarding dtc source links kbuild: set no-integrated-as before incl. arch Makefile kbuild: make scripts/adjust_autoksyms.sh robust against timestamp races
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Fix RCU locking in xfrm_local_error(), from Taehee Yoo. 2) Fix return value assignments and thus error checking in iwl_mvm_start_ap_ibss(), from Johannes Berg. 3) Don't count header length twice in vti4, from Stefano Brivio. 4) Fix deadlock in rt6_age_examine_exception, from Eric Dumazet. 5) Fix out-of-bounds access in nf_sk_lookup_slow{v4,v6}() from Subash Abhinov. 6) Check nladdr size in netlink_connect(), from Alexander Potapenko. 7) VF representor SQ numbers are 32 not 16 bits, in mlx5 driver, from Or Gerlitz. 8) Out of bounds read in skb_network_protocol(), from Eric Dumazet. 9) r8169 driver sets driver data pointer after register_netdev() which is too late. Fix from Heiner Kallweit. 10) Fix memory leak in mlx4 driver, from Moshe Shemesh. 11) The multi-VLAN decap fix added a regression when dealing with device that lack a MAC header, such as tun. Fix from Toshiaki Makita. 12) Fix integer overflow in dynamic interrupt coalescing code. From Tal Gilboa. 13) Use after free in vrf code, from David Ahern. 14) IPV6 route leak between VRFs fix, also from David Ahern. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits) net: mvneta: fix enable of all initialized RXQs net/ipv6: Fix route leaking between VRFs vrf: Fix use after free and double free in vrf_finish_output ipv6: sr: fix seg6 encap performances with TSO enabled net/dim: Fix int overflow vlan: Fix vlan insertion for packets without ethernet header net: Fix untag for vlan packets without ethernet header atm: iphase: fix spelling mistake: "Receiverd" -> "Received" vhost: validate log when IOTLB is enabled qede: Do not drop rx-checksum invalidated packets. hv_netvsc: enable multicast if necessary ip_tunnel: Resolve ipsec merge conflict properly. lan78xx: Crash in lan78xx_writ_reg (Workqueue: events lan78xx_deferred_multicast_write) qede: Fix barrier usage after tx doorbell write. vhost: correctly remove wait queue during poll failure net/mlx4_core: Fix memory leak while delete slave's resources net/mlx4_en: Fix mixed PFC and Global pause user control requests net/smc: use announced length in sock_recvmsg() llc: properly handle dev_queue_xmit() return value strparser: Fix sign of err codes ...
-
- 30 Mar, 2018 14 commits
-
-
Yelena Krivosheev authored
In mvneta_port_up() we enable relevant RX and TX port queues by write queues bit map to an appropriate register. q_map must be ZERO in the beginning of this process. Signed-off-by: Yelena Krivosheev <yelena@marvell.com> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Donald reported that IPv6 route leaking between VRFs is not working. The root cause is the strict argument in the call to rt6_lookup when validating the nexthop spec. ip6_route_check_nh validates the gateway and device (if given) of a route spec. It in turn could call rt6_lookup (e.g., lookup in a given table did not succeed so it falls back to a full lookup) and if so sets the strict argument to 1. That means if the egress device is given, the route lookup needs to return a result with the same device. This strict requirement does not work with VRFs (IPv4 or IPv6) because the oif in the flow struct is overridden with the index of the VRF device to trigger a match on the l3mdev rule and force the lookup to its table. The right long term solution is to add an l3mdev index to the flow struct such that the oif is not overridden. That solution will not backport well, so this patch aims for a simpler solution to relax the strict argument if the route spec device is an l3mdev slave. As done in other places, use the FLOWI_FLAG_SKIP_NH_OIF to know that the RT6_LOOKUP_F_IFACE flag needs to be removed. Fixes: ca254490 ("net: Add VRF support to IPv6 stack") Reported-by: Donald Sharp <sharpd@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Miguel reported an skb use after free / double free in vrf_finish_output when neigh_output returns an error. The vrf driver should return after the call to neigh_output as it takes over the skb on error path as well. Patch is a simplified version of Miguel's patch which was written for 4.9, and updated to top of tree. Fixes: 8f58336d ("net: Add ethernet header for pass through VRF device") Signed-off-by: Miguel Fadon Perlines <mfadon@teldat.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Lebrun authored
Enabling TSO can lead to abysmal performances when using seg6 in encap mode, such as with the ixgbe driver. This patch adds a call to iptunnel_handle_offloads() to remove the encapsulation bit if needed. Before: root@comp4-seg6bpf:~# iperf3 -c fc00::55 Connecting to host fc00::55, port 5201 [ 4] local fc45::4 port 36592 connected to fc00::55 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 4] 0.00-1.00 sec 196 KBytes 1.60 Mbits/sec 47 6.66 KBytes [ 4] 1.00-2.00 sec 304 KBytes 2.49 Mbits/sec 100 5.33 KBytes [ 4] 2.00-3.00 sec 284 KBytes 2.32 Mbits/sec 92 5.33 KBytes After: root@comp4-seg6bpf:~# iperf3 -c fc00::55 Connecting to host fc00::55, port 5201 [ 4] local fc45::4 port 43062 connected to fc00::55 port 5201 [ ID] Interval Transfer Bandwidth Retr Cwnd [ 4] 0.00-1.00 sec 1.03 GBytes 8.89 Gbits/sec 0 743 KBytes [ 4] 1.00-2.00 sec 1.03 GBytes 8.87 Gbits/sec 0 743 KBytes [ 4] 2.00-3.00 sec 1.03 GBytes 8.87 Gbits/sec 0 743 KBytes Reported-by: Tom Herbert <tom@quantonium.net> Fixes: 6c8702c6 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels") Signed-off-by: David Lebrun <dlebrun@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://github.com/ceph/ceph-clientLinus Torvalds authored
Pull ceph fix from Ilya Dryomov: "A fix for a dio-enabled loop on ceph deadlock from Zheng, marked for stable" * tag 'ceph-for-4.16-rc8' of git://github.com/ceph/ceph-client: ceph: only dirty ITER_IOVEC pages for direct read
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM fixes from Radim Krčmář: "PPC: - Fix a bug causing occasional machine check exceptions on POWER8 hosts (introduced in 4.16-rc1) x86: - Fix a guest crashing regression with nested VMX and restricted guest (introduced in 4.16-rc1) - Fix dependency check for pv tlb flush (the wrong dependency that effectively disabled the feature was added in 4.16-rc4, the original feature in 4.16-rc1, so it got decent testing)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix pv tlb flush dependencies KVM: nVMX: sync vmcs02 segment regs prior to vmx_set_cr0 KVM: PPC: Book3S HV: Fix duplication of host SLB entries
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull i2c fix from Wolfram Sang: "A simple but worthwhile I2C driver fix for 4.16" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: i2c-stm32f7: fix no check on returned setup
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds authored
Pull sound fixes from Takashi Iwai: "Very small fixes (all one-liners) at this time. One fix is for a PCM core stuff to correct the mmap behavior on non-x86. It doesn't show on most machines but mostly only for exotic non-interleaved formats" * tag 'sound-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: pcm: potential uninitialized return values ALSA: pcm: Use dma_bytes as size parameter in dma_mmap_coherent() ALSA: usb-audio: Add native DSD support for TEAC UD-301
-
Tal Gilboa authored
When calculating difference between samples, the values are multiplied by 100. Large values may cause int overflow when multiplied (usually on first iteration). Fixed by forcing 100 to be of type unsigned long. Fixes: 4c4dbb4a ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Toshiaki Makita says: ==================== Fix vlan tag handling for vlan packets without ethernet headers Eric Dumazet reported syzbot found a new bug which leads to underflow of size argument of memmove(), causing crash[1]. This can be triggered by tun devices. The underflow happened because skb_vlan_untag() did not expect vlan packets without ethernet headers, and tun can produce such packets. I also checked vlan_insert_inner_tag() and found a similar bug. This series fixes these problems. [1] https://marc.info/?l=linux-netdev&m=152221753920510&w=2 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Toshiaki Makita authored
In some situation vlan packets do not have ethernet headers. One example is packets from tun devices. Users can specify vlan protocol in tun_pi field instead of IP protocol. When we have a vlan device with reorder_hdr disabled on top of the tun device, such packets from tun devices are untagged in skb_vlan_untag() and vlan headers will be inserted back in vlan_insert_inner_tag(). vlan_insert_inner_tag() however did not expect packets without ethernet headers, so in such a case size argument for memmove() underflowed. We don't need to copy headers for packets which do not have preceding headers of vlan headers, so skip memmove() in that case. Also don't write vlan protocol in skb->data when it does not have enough room for it. Fixes: cbe7128c ("vlan: Fix out of order vlan headers with reorder header off") Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Toshiaki Makita authored
In some situation vlan packets do not have ethernet headers. One example is packets from tun devices. Users can specify vlan protocol in tun_pi field instead of IP protocol, and skb_vlan_untag() attempts to untag such packets. skb_vlan_untag() (more precisely, skb_reorder_vlan_header() called by it) however did not expect packets without ethernet headers, so in such a case size argument for memmove() underflowed and triggered crash. ==== BUG: unable to handle kernel paging request at ffff8801cccb8000 IP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43 PGD 9cee067 P4D 9cee067 PUD 1d9401063 PMD 1cccb7063 PTE 2810100028101 Oops: 000b [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 17663 Comm: syz-executor2 Not tainted 4.16.0-rc7+ #368 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43 RSP: 0018:ffff8801cc046e28 EFLAGS: 00010287 RAX: ffff8801ccc244c4 RBX: fffffffffffffffe RCX: fffffffffff6c4c2 RDX: fffffffffffffffe RSI: ffff8801cccb7ffc RDI: ffff8801cccb8000 RBP: ffff8801cc046e48 R08: ffff8801ccc244be R09: ffffed0039984899 R10: 0000000000000001 R11: ffffed0039984898 R12: ffff8801ccc244c4 R13: ffff8801ccc244c0 R14: ffff8801d96b7c06 R15: ffff8801d96b7b40 FS: 00007febd562d700(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8801cccb8000 CR3: 00000001ccb2f006 CR4: 00000000001606e0 DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Call Trace: memmove include/linux/string.h:360 [inline] skb_reorder_vlan_header net/core/skbuff.c:5031 [inline] skb_vlan_untag+0x470/0xc40 net/core/skbuff.c:5061 __netif_receive_skb_core+0x119c/0x3460 net/core/dev.c:4460 __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4627 netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4701 netif_receive_skb+0xae/0x390 net/core/dev.c:4725 tun_rx_batched.isra.50+0x5ee/0x870 drivers/net/tun.c:1555 tun_get_user+0x299e/0x3c20 drivers/net/tun.c:1962 tun_chr_write_iter+0xb9/0x160 drivers/net/tun.c:1990 call_write_iter include/linux/fs.h:1782 [inline] new_sync_write fs/read_write.c:469 [inline] __vfs_write+0x684/0x970 fs/read_write.c:482 vfs_write+0x189/0x510 fs/read_write.c:544 SYSC_write fs/read_write.c:589 [inline] SyS_write+0xef/0x220 fs/read_write.c:581 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x454879 RSP: 002b:00007febd562cc68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007febd562d6d4 RCX: 0000000000454879 RDX: 0000000000000157 RSI: 0000000020000180 RDI: 0000000000000014 RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000000006b0 R14: 00000000006fc120 R15: 0000000000000000 Code: 90 90 90 90 90 90 90 48 89 f8 48 83 fa 20 0f 82 03 01 00 00 48 39 fe 7d 0f 49 89 f0 49 01 d0 49 39 f8 0f 8f 9f 00 00 00 48 89 d1 <f3> a4 c3 48 81 fa a8 02 00 00 72 05 40 38 fe 74 3b 48 83 ea 20 RIP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43 RSP: ffff8801cc046e28 CR2: ffff8801cccb8000 ==== We don't need to copy headers for packets which do not have preceding headers of vlan headers, so skip memmove() in that case. Fixes: 4bbb3e0e ("net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
Trivial fix to spelling mistake in message text Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yan, Zheng authored
If a page is already locked, attempting to dirty it leads to a deadlock in lock_page(). This is what currently happens to ITER_BVEC pages when a dio-enabled loop device is backed by ceph: $ losetup --direct-io /dev/loop0 /mnt/cephfs/img $ xfs_io -c 'pread 0 4k' /dev/loop0 Follow other file systems and only dirty ITER_IOVEC pages. Cc: stable@kernel.org Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-