- 14 Mar, 2019 40 commits
-
-
Christian Borntraeger authored
BugLink: https://bugs.launchpad.net/bugs/1818803 commit 03aa047e upstream. Right now the early machine detection code check stsi 3.2.2 for "KVM" and set MACHINE_IS_VM if this is different. As the console detection uses diagnose 8 if MACHINE_IS_VM returns true this will crash Linux early for any non z/VM system that sets a different value than KVM. So instead of assuming z/VM, do not set any of MACHINE_IS_LPAR, MACHINE_IS_VM, or MACHINE_IS_KVM. CC: stable@vger.kernel.org Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Eugeniy Paltsev authored
BugLink: https://bugs.launchpad.net/bugs/1818803 commit 3affbf0e upstream. So far we've mapped branches to "ijmp" which also counts conditional branches NOT taken. This makes us different from other architectures such as ARM which seem to be counting only taken branches. So use "ijmptak" hardware condition which only counts (all jump instructions that are taken) 'ijmptak' event is available on both ARCompact and ARCv2 ISA based cores. Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com> Cc: stable@vger.kernel.org Signed-off-by: Vineet Gupta <vgupta@synopsys.com> [vgupta: reworked changelog] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Kangjie Lu authored
BugLink: https://bugs.launchpad.net/bugs/1818803 commit 44fabd8c upstream. snd_pcm_lib_malloc_pages() may fail, so let's check its status and return its error code upstream. Signed-off-by: Kangjie Lu <kjlu@umn.edu> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Charles Yeh authored
BugLink: https://bugs.launchpad.net/bugs/1818803 commit 4dcf9ddc upstream. Add new PID to support PL2303TB (TYPE_HX) Signed-off-by: Charles Yeh <charlesyeh522@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Max Schulze authored
BugLink: https://bugs.launchpad.net/bugs/1818803 commit b81c2c33 upstream. Add new Motorola Tetra device id for Motorola Solutions TETRA PEI device T: Bus=02 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 4 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=0cad ProdID=9016 Rev=24.16 S: Manufacturer=Motorola Solutions, Inc. S: Product=TETRA PEI interface C: #Ifs= 2 Cfg#= 1 Atr=80 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=usb_serial_simple I: If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=usb_serial_simple Signed-off-by: Max Schulze <max.schulze@posteo.de> Cc: stable <stable@vger.kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Yunjian Wang authored
BugLink: https://bugs.launchpad.net/bugs/1818803 [ Upstream commit 28c1382f ] The skb header should be set to ethernet header before using is_skb_forwardable. Because the ethernet header length has been considered in is_skb_forwardable(including dev->hard_header_len length). To reproduce the issue: 1, add 2 ports on linux bridge br using following commands: $ brctl addbr br $ brctl addif br eth0 $ brctl addif br eth1 2, the MTU of eth0 and eth1 is 1500 3, send a packet(Data 1480, UDP 8, IP 20, Ethernet 14, VLAN 4) from eth0 to eth1 So the expect result is packet larger than 1500 cannot pass through eth0 and eth1. But currently, the packet passes through success, it means eth1's MTU limit doesn't take effect. Fixes: f6367b46 ("bridge: use is_skb_forwardable in forward path") Cc: bridge@lists.linux-foundation.org Cc: Nkolay Aleksandrov <nikolay@cumulusnetworks.com> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Cong Wang authored
BugLink: https://bugs.launchpad.net/bugs/1818803 [ Upstream commit cd0c4e70 ] Martin reported a set of filters don't work after changing from reclassify to continue. Looking into the code, it looks like skb protocol is not always fetched for each iteration of the filters. But, as demonstrated by Martin, TC actions could modify skb->protocol, for example act_vlan, this means we have to refetch skb protocol in each iteration, rather than using the one we fetch in the beginning of the loop. This bug is _not_ introduced by commit 3b3ae880 ("net: sched: consolidate tc_classify{,_compat}"), technically, if act_vlan is the only action that modifies skb protocol, then it is commit c7e2b968 ("sched: introduce vlan action") which introduced this bug. Reported-by: Martin Olsson <martin.olsson+netdev@sentorsecurity.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Ido Schimmel authored
BugLink: https://bugs.launchpad.net/bugs/1818803 [ Upstream commit f97f4dd8 ] IPv4 routing tables are flushed in two cases: 1. In response to events in the netdev and inetaddr notification chains 2. When a network namespace is being dismantled In both cases only routes associated with a dead nexthop group are flushed. However, a nexthop group will only be marked as dead in case it is populated with actual nexthops using a nexthop device. This is not the case when the route in question is an error route (e.g., 'blackhole', 'unreachable'). Therefore, when a network namespace is being dismantled such routes are not flushed and leaked [1]. To reproduce: # ip netns add blue # ip -n blue route add unreachable 192.0.2.0/24 # ip netns del blue Fix this by not skipping error routes that are not marked with RTNH_F_DEAD when flushing the routing tables. To prevent the flushing of such routes in case #1, add a parameter to fib_table_flush() that indicates if the table is flushed as part of namespace dismantle or not. Note that this problem does not exist in IPv6 since error routes are associated with the loopback device. [1] unreferenced object 0xffff888066650338 (size 56): comm "ip", pid 1206, jiffies 4294786063 (age 26.235s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 b0 1c 62 61 80 88 ff ff ..........ba.... e8 8b a1 64 80 88 ff ff 00 07 00 08 fe 00 00 00 ...d............ backtrace: [<00000000856ed27d>] inet_rtm_newroute+0x129/0x220 [<00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20 [<00000000cb85801a>] netlink_rcv_skb+0x132/0x380 [<00000000ebc991d2>] netlink_unicast+0x4c0/0x690 [<0000000014f62875>] netlink_sendmsg+0x929/0xe10 [<00000000bac9d967>] sock_sendmsg+0xc8/0x110 [<00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0 [<000000002e94f880>] __sys_sendmsg+0xf7/0x250 [<00000000ccb1fa72>] do_syscall_64+0x14d/0x610 [<00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<000000003a8b605b>] 0xffffffffffffffff unreferenced object 0xffff888061621c88 (size 48): comm "ip", pid 1206, jiffies 4294786063 (age 26.235s) hex dump (first 32 bytes): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk 6b 6b 6b 6b 6b 6b 6b 6b d8 8e 26 5f 80 88 ff ff kkkkkkkk..&_.... backtrace: [<00000000733609e3>] fib_table_insert+0x978/0x1500 [<00000000856ed27d>] inet_rtm_newroute+0x129/0x220 [<00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20 [<00000000cb85801a>] netlink_rcv_skb+0x132/0x380 [<00000000ebc991d2>] netlink_unicast+0x4c0/0x690 [<0000000014f62875>] netlink_sendmsg+0x929/0xe10 [<00000000bac9d967>] sock_sendmsg+0xc8/0x110 [<00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0 [<000000002e94f880>] __sys_sendmsg+0xf7/0x250 [<00000000ccb1fa72>] do_syscall_64+0x14d/0x610 [<00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<000000003a8b605b>] 0xffffffffffffffff Fixes: 8cced9ef ("[NETNS]: Enable routing configuration in non-initial namespace.") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Ross Lagerwall authored
BugLink: https://bugs.launchpad.net/bugs/1818803 [ Upstream commit 04a4af33 ] For nested and variable attributes, the expected length of an attribute is not known and marked by a negative number. This results in an OOB read when the expected length is later used to check if the attribute is all zeros. Fix this by using the actual length of the attribute rather than the expected length. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Ross Lagerwall authored
BugLink: https://bugs.launchpad.net/bugs/1818803 [ Upstream commit 6c57f045 ] In certain cases, pskb_trim_rcsum() may change skb pointers. Reinitialize header pointers afterwards to avoid potential use-after-frees. Add a note in the documentation of pskb_trim_rcsum(). Found by KASAN. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Greg Kroah-Hartman authored
BugLink: https://bugs.launchpad.net/bugs/1818797Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Corey Minyard authored
BugLink: https://bugs.launchpad.net/bugs/1818797 commit 7d6380cd upstream. The block number was not being compared right, it was off by one when checking the response. Some statistics wouldn't be incremented properly in some cases. Check to see if that middle-part messages always have 31 bytes of data. Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: stable@vger.kernel.org # 4.4 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Eric Dumazet authored
BugLink: https://bugs.launchpad.net/bugs/1818797 commit 7c90584c upstream. As measured in my prior patch ("sch_netem: faster rb tree removal"), rbtree_postorder_for_each_entry_safe() is nice looking but much slower than using rb_next() directly, except when tree is small enough to fit in CPU caches (then the cost is the same) Also note that there is not even an increase of text size : $ size net/core/skbuff.o.before net/core/skbuff.o text data bss dec hex filename 40711 1298 0 42009 a419 net/core/skbuff.o.before 40711 1298 0 42009 a419 net/core/skbuff.o From: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Michal Hocko authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 7550c607 ] Patch series "THP eligibility reporting via proc". This series of three patches aims at making THP eligibility reporting much more robust and long term sustainable. The trigger for the change is a regression report [2] and the long follow up discussion. In short the specific application didn't have good API to query whether a particular mapping can be backed by THP so it has used VMA flags to workaround that. These flags represent a deep internal state of VMAs and as such they should be used by userspace with a great deal of caution. A similar has happened for [3] when users complained that VM_MIXEDMAP is no longer set on DAX mappings. Again a lack of a proper API led to an abuse. The first patch in the series tries to emphasise that that the semantic of flags might change and any application consuming those should be really careful. The remaining two patches provide a more suitable interface to address [2] and provide a consistent API to query the THP status both for each VMA and process wide as well. [1] http://lkml.kernel.org/r/20181120103515.25280-1-mhocko@kernel.org [2] http://lkml.kernel.org/r/http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com [3] http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz This patch (of 3): Even though vma flags exported via /proc/<pid>/smaps are explicitly documented to be not guaranteed for future compatibility the warning doesn't go far enough because it doesn't mention semantic changes to those flags. And they are important as well because these flags are a deep implementation internal to the MM code and the semantic might change at any time. Let's consider two recent examples: http://lkml.kernel.org/r/20181002100531.GC4135@quack2.suse.cz : commit e1fb4a08 "dax: remove VM_MIXEDMAP for fsdax and device dax" has : removed VM_MIXEDMAP flag from DAX VMAs. Now our testing shows that in the : mean time certain customer of ours started poking into /proc/<pid>/smaps : and looks at VMA flags there and if VM_MIXEDMAP is missing among the VMA : flags, the application just fails to start complaining that DAX support is : missing in the kernel. http://lkml.kernel.org/r/alpine.DEB.2.21.1809241054050.224429@chino.kir.corp.google.com : Commit 18600332 ("mm: make PR_SET_THP_DISABLE immediately active") : introduced a regression in that userspace cannot always determine the set : of vmas where thp is ineligible. : Userspace relies on the "nh" flag being emitted as part of /proc/pid/smaps : to determine if a vma is eligible to be backed by hugepages. : Previous to this commit, prctl(PR_SET_THP_DISABLE, 1) would cause thp to : be disabled and emit "nh" as a flag for the corresponding vmas as part of : /proc/pid/smaps. After the commit, thp is disabled by means of an mm : flag and "nh" is not emitted. : This causes smaps parsing libraries to assume a vma is eligible for thp : and ends up puzzling the user on why its memory is not backed by thp. In both cases userspace was relying on a semantic of a specific VMA flag. The primary reason why that happened is a lack of a proper interface. While this has been worked on and it will be fixed properly, it seems that our wording could see some refinement and be more vocal about semantic aspect of these flags as well. Link: http://lkml.kernel.org/r/20181211143641.3503-2-mhocko@kernel.orgSigned-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Paul Oppenheimer <bepvte@gmail.com> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Brian Foster authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 3fa750dc ] write_cache_pages() is used in both background and integrity writeback scenarios by various filesystems. Background writeback is mostly concerned with cleaning a certain number of dirty pages based on various mm heuristics. It may not write the full set of dirty pages or wait for I/O to complete. Integrity writeback is responsible for persisting a set of dirty pages before the writeback job completes. For example, an fsync() call must perform integrity writeback to ensure data is on disk before the call returns. write_cache_pages() unconditionally breaks out of its processing loop in the event of a ->writepage() error. This is fine for background writeback, which had no strict requirements and will eventually come around again. This can cause problems for integrity writeback on filesystems that might need to clean up state associated with failed page writeouts. For example, XFS performs internal delayed allocation accounting before returning a ->writepage() error, where applicable. If the current writeback happens to be associated with an unmount and write_cache_pages() completes the writeback prematurely due to error, the filesystem is unmounted in an inconsistent state if dirty+delalloc pages still exist. To handle this problem, update write_cache_pages() to always process the full set of pages for integrity writeback regardless of ->writepage() errors. Save the first encountered error and return it to the caller once complete. This facilitates XFS (or any other fs that expects integrity writeback to process the entire set of dirty pages) to clean up its internal state completely in the event of persistent mapping errors. Background writeback continues to exit on the first error encountered. [akpm@linux-foundation.org: fix typo in comment] Link: http://lkml.kernel.org/r/20181116134304.32440-1-bfoster@redhat.comSigned-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Junxiao Bi authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 532e1e54 ] mount.ocfs2 ignore the inconsistent error that journal is clean but local alloc is unrecovered. After mount, local alloc not empty, then reserver cluster didn't alloc a new local alloc window, reserveration map is empty(ocfs2_reservation_map.m_bitmap_len = 0), that triggered the following panic. This issue was reported at https://oss.oracle.com/pipermail/ocfs2-devel/2015-May/010854.html and was advised to fixed during mount. But this is a very unusual inconsistent state, usually journal dirty flag should be cleared at the last stage of umount until every other things go right. We may need do further debug to check that. Any way to avoid possible futher corruption, mount should be abort and fsck should be run. (mount.ocfs2,1765,1):ocfs2_load_local_alloc:353 ERROR: Local alloc hasn't been recovered! found = 6518, set = 6518, taken = 8192, off = 15912372 ocfs2: Mounting device (202,64) on (node 0, slot 3) with ordered data mode. o2dlm: Joining domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 8 ) 8 nodes ocfs2: Mounting device (202,80) on (node 0, slot 3) with ordered data mode. o2hb: Region 89CEAC63CC4F4D03AC185B44E0EE0F3F (xvdf) is now a quorum device o2net: Accepted connection from node yvwsoa17p (num 7) at 172.22.77.88:7777 o2dlm: Node 7 joins domain 64FE421C8C984E6D96ED12C55FEE2435 ( 0 1 2 3 4 5 6 7 8 ) 9 nodes o2dlm: Node 7 joins domain 89CEAC63CC4F4D03AC185B44E0EE0F3F ( 0 1 2 3 4 5 6 7 8 ) 9 nodes ------------[ cut here ]------------ kernel BUG at fs/ocfs2/reservations.c:507! invalid opcode: 0000 [#1] SMP Modules linked in: ocfs2 rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs fscache lockd grace ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 ovmapi ppdev parport_pc parport xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea acpi_cpufreq pcspkr i2c_piix4 i2c_core sg ext4 jbd2 mbcache2 sr_mod cdrom xen_blkfront pata_acpi ata_generic ata_piix floppy dm_mirror dm_region_hash dm_log dm_mod CPU: 0 PID: 4349 Comm: startWebLogic.s Not tainted 4.1.12-124.19.2.el6uek.x86_64 #2 Hardware name: Xen HVM domU, BIOS 4.4.4OVM 09/06/2018 task: ffff8803fb04e200 ti: ffff8800ea4d8000 task.ti: ffff8800ea4d8000 RIP: 0010:[<ffffffffa05e96a8>] [<ffffffffa05e96a8>] __ocfs2_resv_find_window+0x498/0x760 [ocfs2] Call Trace: ocfs2_resmap_resv_bits+0x10d/0x400 [ocfs2] ocfs2_claim_local_alloc_bits+0xd0/0x640 [ocfs2] __ocfs2_claim_clusters+0x178/0x360 [ocfs2] ocfs2_claim_clusters+0x1f/0x30 [ocfs2] ocfs2_convert_inline_data_to_extents+0x634/0xa60 [ocfs2] ocfs2_write_begin_nolock+0x1c6/0x1da0 [ocfs2] ocfs2_write_begin+0x13e/0x230 [ocfs2] generic_perform_write+0xbf/0x1c0 __generic_file_write_iter+0x19c/0x1d0 ocfs2_file_write_iter+0x589/0x1360 [ocfs2] __vfs_write+0xb8/0x110 vfs_write+0xa9/0x1b0 SyS_write+0x46/0xb0 system_call_fastpath+0x18/0xd7 Code: ff ff 8b 75 b8 39 75 b0 8b 45 c8 89 45 98 0f 84 e5 fe ff ff 45 8b 74 24 18 41 8b 54 24 1c e9 56 fc ff ff 85 c0 0f 85 48 ff ff ff <0f> 0b 48 8b 05 cf c3 de ff 48 ba 00 00 00 00 00 00 00 10 48 85 RIP __ocfs2_resv_find_window+0x498/0x760 [ocfs2] RSP <ffff8800ea4db668> ---[ end trace 566f07529f2edf3c ]--- Kernel panic - not syncing: Fatal exception Kernel Offset: disabled Link: http://lkml.kernel.org/r/20181121020023.3034-2-junxiao.bi@oracle.comSigned-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com> Acked-by: Joseph Qi <jiangqi903@gmail.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Qian Cai authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit c7a082e4 ] UBSAN reported those with MegaRAID SAS-3 3108, [ 77.467308] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32 [ 77.475402] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]' [ 77.481677] CPU: 16 PID: 333 Comm: kworker/16:1 Not tainted 4.20.0-rc5+ #1 [ 77.488556] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [ 77.495791] Workqueue: events work_for_cpu_fn [ 77.500154] Call trace: [ 77.502610] dump_backtrace+0x0/0x2c8 [ 77.506279] show_stack+0x24/0x30 [ 77.509604] dump_stack+0x118/0x19c [ 77.513098] ubsan_epilogue+0x14/0x60 [ 77.516765] __ubsan_handle_out_of_bounds+0xfc/0x13c [ 77.521767] mr_update_load_balance_params+0x150/0x158 [megaraid_sas] [ 77.528230] MR_ValidateMapInfo+0x2cc/0x10d0 [megaraid_sas] [ 77.533825] megasas_get_map_info+0x244/0x2f0 [megaraid_sas] [ 77.539505] megasas_init_adapter_fusion+0x9b0/0xf48 [megaraid_sas] [ 77.545794] megasas_init_fw+0x1ab4/0x3518 [megaraid_sas] [ 77.551212] megasas_probe_one+0x2c4/0xbe0 [megaraid_sas] [ 77.556614] local_pci_probe+0x7c/0xf0 [ 77.560365] work_for_cpu_fn+0x34/0x50 [ 77.564118] process_one_work+0x61c/0xf08 [ 77.568129] worker_thread+0x534/0xa70 [ 77.571882] kthread+0x1c8/0x1d0 [ 77.575114] ret_from_fork+0x10/0x1c [ 89.240332] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32 [ 89.248426] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]' [ 89.254700] CPU: 16 PID: 95 Comm: kworker/u130:0 Not tainted 4.20.0-rc5+ #1 [ 89.261665] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018 [ 89.268903] Workqueue: events_unbound async_run_entry_fn [ 89.274222] Call trace: [ 89.276680] dump_backtrace+0x0/0x2c8 [ 89.280348] show_stack+0x24/0x30 [ 89.283671] dump_stack+0x118/0x19c [ 89.287167] ubsan_epilogue+0x14/0x60 [ 89.290835] __ubsan_handle_out_of_bounds+0xfc/0x13c [ 89.295828] MR_LdRaidGet+0x50/0x58 [megaraid_sas] [ 89.300638] megasas_build_io_fusion+0xbb8/0xd90 [megaraid_sas] [ 89.306576] megasas_build_and_issue_cmd_fusion+0x138/0x460 [megaraid_sas] [ 89.313468] megasas_queue_command+0x398/0x3d0 [megaraid_sas] [ 89.319222] scsi_dispatch_cmd+0x1dc/0x8a8 [ 89.323321] scsi_request_fn+0x8e8/0xdd0 [ 89.327249] __blk_run_queue+0xc4/0x158 [ 89.331090] blk_execute_rq_nowait+0xf4/0x158 [ 89.335449] blk_execute_rq+0xdc/0x158 [ 89.339202] __scsi_execute+0x130/0x258 [ 89.343041] scsi_probe_and_add_lun+0x2fc/0x1488 [ 89.347661] __scsi_scan_target+0x1cc/0x8c8 [ 89.351848] scsi_scan_channel.part.3+0x8c/0xc0 [ 89.356382] scsi_scan_host_selected+0x130/0x1f0 [ 89.361002] do_scsi_scan_host+0xd8/0xf0 [ 89.364927] do_scan_async+0x9c/0x320 [ 89.368594] async_run_entry_fn+0x138/0x420 [ 89.372780] process_one_work+0x61c/0xf08 [ 89.376793] worker_thread+0x13c/0xa70 [ 89.380546] kthread+0x1c8/0x1d0 [ 89.383778] ret_from_fork+0x10/0x1c This is because when populating Driver Map using firmware raid map, all non-existing VDs set their ldTgtIdToLd to 0xff, so it can be skipped later. From drivers/scsi/megaraid/megaraid_sas_base.c , memset(instance->ld_ids, 0xff, MEGASAS_MAX_LD_IDS); From drivers/scsi/megaraid/megaraid_sas_fp.c , /* For non existing VDs, iterate to next VD*/ if (ld >= (MAX_LOGICAL_DRIVES_EXT - 1)) continue; However, there are a few places that failed to skip those non-existing VDs due to off-by-one errors. Then, those 0xff leaked into MR_LdRaidGet(0xff, map) and triggered the out-of-bound accesses. Fixes: 51087a86 ("megaraid_sas : Extended VD support") Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Daniel Vetter authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 4f4b3743 ] This is the much more correct fix for my earlier attempt at: https://lkml.org/lkml/2018/12/10/118 Short recap: - There's not actually a locking issue, it's just lockdep being a bit too eager to complain about a possible deadlock. - Contrary to what I claimed the real problem is recursion on kn->count. Greg pointed me at sysfs_break_active_protection(), used by the scsi subsystem to allow a sysfs file to unbind itself. That would be a real deadlock, which isn't what's happening here. Also, breaking the active protection means we'd need to manually handle all the lifetime fun. - With Rafael we discussed the task_work approach, which kinda works, but has two downsides: It's a functional change for a lockdep annotation issue, and it won't work for the bind file (which needs to get the errno from the driver load function back to userspace). - Greg also asked why this never showed up: To hit this you need to unregister a 2nd driver from the unload code of your first driver. I guess only gpus do that. The bug has always been there, but only with a recent patch series did we add more locks so that lockdep built a chain from unbinding the snd-hda driver to the acpi_video_unregister call. Full lockdep splat: [12301.898799] ============================================ [12301.898805] WARNING: possible recursive locking detected [12301.898811] 4.20.0-rc7+ #84 Not tainted [12301.898815] -------------------------------------------- [12301.898821] bash/5297 is trying to acquire lock: [12301.898826] 00000000f61c6093 (kn->count#39){++++}, at: kernfs_remove_by_name_ns+0x3b/0x80 [12301.898841] but task is already holding lock: [12301.898847] 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190 [12301.898856] other info that might help us debug this: [12301.898862] Possible unsafe locking scenario: [12301.898867] CPU0 [12301.898870] ---- [12301.898874] lock(kn->count#39); [12301.898879] lock(kn->count#39); [12301.898883] *** DEADLOCK *** [12301.898891] May be due to missing lock nesting notation [12301.898899] 5 locks held by bash/5297: [12301.898903] #0: 00000000cd800e54 (sb_writers#4){.+.+}, at: vfs_write+0x17f/0x1b0 [12301.898915] #1: 000000000465e7c2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xd3/0x190 [12301.898925] #2: 000000005f634021 (kn->count#39){++++}, at: kernfs_fop_write+0xdc/0x190 [12301.898936] #3: 00000000414ef7ac (&dev->mutex){....}, at: device_release_driver_internal+0x34/0x240 [12301.898950] #4: 000000003218fbdf (register_count_mutex){+.+.}, at: acpi_video_unregister+0xe/0x40 [12301.898960] stack backtrace: [12301.898968] CPU: 1 PID: 5297 Comm: bash Not tainted 4.20.0-rc7+ #84 [12301.898974] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011 [12301.898982] Call Trace: [12301.898989] dump_stack+0x67/0x9b [12301.898997] __lock_acquire+0x6ad/0x1410 [12301.899003] ? kernfs_remove_by_name_ns+0x3b/0x80 [12301.899010] ? find_held_lock+0x2d/0x90 [12301.899017] ? mutex_spin_on_owner+0xe4/0x150 [12301.899023] ? find_held_lock+0x2d/0x90 [12301.899030] ? lock_acquire+0x90/0x180 [12301.899036] lock_acquire+0x90/0x180 [12301.899042] ? kernfs_remove_by_name_ns+0x3b/0x80 [12301.899049] __kernfs_remove+0x296/0x310 [12301.899055] ? kernfs_remove_by_name_ns+0x3b/0x80 [12301.899060] ? kernfs_name_hash+0xd/0x80 [12301.899066] ? kernfs_find_ns+0x6c/0x100 [12301.899073] kernfs_remove_by_name_ns+0x3b/0x80 [12301.899080] bus_remove_driver+0x92/0xa0 [12301.899085] acpi_video_unregister+0x24/0x40 [12301.899127] i915_driver_unload+0x42/0x130 [i915] [12301.899160] i915_pci_remove+0x19/0x30 [i915] [12301.899169] pci_device_remove+0x36/0xb0 [12301.899176] device_release_driver_internal+0x185/0x240 [12301.899183] unbind_store+0xaf/0x180 [12301.899189] kernfs_fop_write+0x104/0x190 [12301.899195] __vfs_write+0x31/0x180 [12301.899203] ? rcu_read_lock_sched_held+0x6f/0x80 [12301.899209] ? rcu_sync_lockdep_assert+0x29/0x50 [12301.899216] ? __sb_start_write+0x13c/0x1a0 [12301.899221] ? vfs_write+0x17f/0x1b0 [12301.899227] vfs_write+0xb9/0x1b0 [12301.899233] ksys_write+0x50/0xc0 [12301.899239] do_syscall_64+0x4b/0x180 [12301.899247] entry_SYSCALL_64_after_hwframe+0x49/0xbe [12301.899253] RIP: 0033:0x7f452ac7f7a4 [12301.899259] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 aa f0 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83 [12301.899273] RSP: 002b:00007ffceafa6918 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [12301.899282] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f452ac7f7a4 [12301.899288] RDX: 000000000000000d RSI: 00005612a1abf7c0 RDI: 0000000000000001 [12301.899295] RBP: 00005612a1abf7c0 R08: 000000000000000a R09: 00005612a1c46730 [12301.899301] R10: 000000000000000a R11: 0000000000000246 R12: 000000000000000d [12301.899308] R13: 0000000000000001 R14: 00007f452af4a740 R15: 000000000000000d Looking around I've noticed that usb and i2c already handle similar recursion problems, where a sysfs file can unbind the same type of sysfs somewhere else in the hierarchy. Relevant commits are: commit 356c05d5 Author: Alan Stern <stern@rowland.harvard.edu> Date: Mon May 14 13:30:03 2012 -0400 sysfs: get rid of some lockdep false positives commit e9b526fe Author: Alexander Sverdlin <alexander.sverdlin@nsn.com> Date: Fri May 17 14:56:35 2013 +0200 i2c: suppress lockdep warning on delete_device Implement the same trick for driver bind/unbind. v2: Put the macro into bus.c (Greg). Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Arend van Spriel <aspriel@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Bartosz Golaszewski <brgl@bgdev.pl> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: Vivek Gautam <vivek.gautam@codeaurora.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Takashi Sakamoto authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 644b2e97 ] This commit fixes hard-coded model-id for an unit of Apogee Ensemble with a correct value. This unit uses DM1500 ASIC produced ArchWave AG (formerly known as BridgeCo AG). I note that this model supports three modes in the number of data channels in tx/rx streams; 8 ch pairs, 10 ch pairs, 18 ch pairs. The mode is switched by Vendor-dependent AV/C command, like: $ cd linux-firewire-utils $ ./firewire-request /dev/fw1 fcp 0x00ff000003dbeb0600000000 (8ch pairs) $ ./firewire-request /dev/fw1 fcp 0x00ff000003dbeb0601000000 (10ch pairs) $ ./firewire-request /dev/fw1 fcp 0x00ff000003dbeb0602000000 (18ch pairs) When switching between different mode, the unit disappears from IEEE 1394 bus, then appears on the bus with different combination of stream formats. In a mode of 18 ch pairs, available sampling rate is up to 96.0 kHz, else up to 192.0 kHz. $ ./hinawa-config-rom-printer /dev/fw1 { 'bus-info': { 'adj': False, 'bmc': True, 'chip_ID': 21474898341, 'cmc': True, 'cyc_clk_acc': 100, 'generation': 2, 'imc': True, 'isc': True, 'link_spd': 2, 'max_ROM': 1, 'max_rec': 512, 'name': '1394', 'node_vendor_ID': 987, 'pmc': False}, 'root-directory': [ ['HARDWARE_VERSION', 19], [ 'NODE_CAPABILITIES', { 'addressing': {'64': True, 'fix': True, 'prv': False}, 'misc': {'int': False, 'ms': False, 'spt': True}, 'state': { 'atn': False, 'ded': False, 'drq': True, 'elo': False, 'init': False, 'lst': True, 'off': False}, 'testing': {'bas': False, 'ext': False}}], ['VENDOR', 987], ['DESCRIPTOR', 'Apogee Electronics'], ['MODEL', 126702], ['DESCRIPTOR', 'Ensemble'], ['VERSION', 5297], [ 'UNIT', [ ['SPECIFIER_ID', 41005], ['VERSION', 65537], ['MODEL', 126702], ['DESCRIPTOR', 'Ensemble']]], [ 'DEPENDENT_INFO', [ ['SPECIFIER_ID', 2037], ['VERSION', 1], [(58, 'IMMEDIATE'), 16777159], [(59, 'IMMEDIATE'), 1048576], [(60, 'IMMEDIATE'), 16777159], [(61, 'IMMEDIATE'), 6291456]]]]} Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Nikos Tsironis authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 721b1d98 ] kcopyd has no upper limit to the number of jobs one can allocate and issue. Under certain workloads this can lead to excessive memory usage and workqueue stalls. For example, when creating multiple dm-snapshot targets with a 4K chunk size and then writing to the origin through the page cache. Syncing the page cache causes a large number of BIOs to be issued to the dm-snapshot origin target, which itself issues an even larger (because of the BIO splitting taking place) number of kcopyd jobs. Running the following test, from the device mapper test suite [1], dmtest run --suite snapshot -n many_snapshots_of_same_volume_N , with 8 active snapshots, results in the kcopyd job slab cache growing to 10G. Depending on the available system RAM this can lead to the OOM killer killing user processes: [463.492878] kthreadd invoked oom-killer: gfp_mask=0x6040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null), order=1, oom_score_adj=0 [463.492894] kthreadd cpuset=/ mems_allowed=0 [463.492948] CPU: 7 PID: 2 Comm: kthreadd Not tainted 4.19.0-rc7 #3 [463.492950] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [463.492952] Call Trace: [463.492964] dump_stack+0x7d/0xbb [463.492973] dump_header+0x6b/0x2fc [463.492987] ? lockdep_hardirqs_on+0xee/0x190 [463.493012] oom_kill_process+0x302/0x370 [463.493021] out_of_memory+0x113/0x560 [463.493030] __alloc_pages_slowpath+0xf40/0x1020 [463.493055] __alloc_pages_nodemask+0x348/0x3c0 [463.493067] cache_grow_begin+0x81/0x8b0 [463.493072] ? cache_grow_begin+0x874/0x8b0 [463.493078] fallback_alloc+0x1e4/0x280 [463.493092] kmem_cache_alloc_node+0xd6/0x370 [463.493098] ? copy_process.part.31+0x1c5/0x20d0 [463.493105] copy_process.part.31+0x1c5/0x20d0 [463.493115] ? __lock_acquire+0x3cc/0x1550 [463.493121] ? __switch_to_asm+0x34/0x70 [463.493129] ? kthread_create_worker_on_cpu+0x70/0x70 [463.493135] ? finish_task_switch+0x90/0x280 [463.493165] _do_fork+0xe0/0x6d0 [463.493191] ? kthreadd+0x19f/0x220 [463.493233] kernel_thread+0x25/0x30 [463.493235] kthreadd+0x1bf/0x220 [463.493242] ? kthread_create_on_cpu+0x90/0x90 [463.493248] ret_from_fork+0x3a/0x50 [463.493279] Mem-Info: [463.493285] active_anon:20631 inactive_anon:4831 isolated_anon:0 [463.493285] active_file:80216 inactive_file:80107 isolated_file:435 [463.493285] unevictable:0 dirty:51266 writeback:109372 unstable:0 [463.493285] slab_reclaimable:31191 slab_unreclaimable:3483521 [463.493285] mapped:526 shmem:4903 pagetables:1759 bounce:0 [463.493285] free:33623 free_pcp:2392 free_cma:0 ... [463.493489] Unreclaimable slab info: [463.493513] Name Used Total [463.493522] bio-6 1028KB 1028KB [463.493525] bio-5 1028KB 1028KB [463.493528] dm_snap_pending_exception 236783KB 243789KB [463.493531] dm_exception 41KB 42KB [463.493534] bio-4 1216KB 1216KB [463.493537] bio-3 439396KB 439396KB [463.493539] kcopyd_job 6973427KB 6973427KB ... [463.494340] Out of memory: Kill process 1298 (ruby2.3) score 1 or sacrifice child [463.494673] Killed process 1298 (ruby2.3) total-vm:435740kB, anon-rss:20180kB, file-rss:4kB, shmem-rss:0kB [463.506437] oom_reaper: reaped process 1298 (ruby2.3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB Moreover, issuing a large number of kcopyd jobs results in kcopyd hogging the CPU, while processing them. As a result, processing of work items, queued for execution on the same CPU as the currently running kcopyd thread, is stalled for long periods of time, hurting performance. Running the aforementioned test we get, in dmesg, messages like the following: [67501.194592] BUG: workqueue lockup - pool cpus=4 node=0 flags=0x0 nice=0 stuck for 27s! [67501.195586] Showing busy workqueues and worker pools: [67501.195591] workqueue events: flags=0x0 [67501.195597] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195611] pending: cache_reap [67501.195641] workqueue mm_percpu_wq: flags=0x8 [67501.195645] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195656] pending: vmstat_update [67501.195682] workqueue kblockd: flags=0x18 [67501.195687] pwq 5: cpus=2 node=0 flags=0x0 nice=-20 active=1/256 [67501.195698] pending: blk_timeout_work [67501.195753] workqueue kcopyd: flags=0x8 [67501.195757] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195768] pending: do_work [dm_mod] [67501.195802] workqueue kcopyd: flags=0x8 [67501.195806] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195817] pending: do_work [dm_mod] [67501.195834] workqueue kcopyd: flags=0x8 [67501.195838] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195848] pending: do_work [dm_mod] [67501.195881] workqueue kcopyd: flags=0x8 [67501.195885] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=1/256 [67501.195896] pending: do_work [dm_mod] [67501.195920] workqueue kcopyd: flags=0x8 [67501.195924] pwq 8: cpus=4 node=0 flags=0x0 nice=0 active=2/256 [67501.195935] in-flight: 67:do_work [dm_mod] [67501.195945] pending: do_work [dm_mod] [67501.195961] pool 8: cpus=4 node=0 flags=0x0 nice=0 hung=27s workers=3 idle: 129 23765 The root cause for these issues is the way dm-snapshot uses kcopyd. In particular, the lack of an explicit or implicit limit to the maximum number of in-flight COW jobs. The merging path is not affected because it implicitly limits the in-flight kcopyd jobs to one. Fix these issues by using a semaphore to limit the maximum number of in-flight kcopyd jobs. We grab the semaphore before allocating a new kcopyd job in start_copy() and start_full_bio() and release it after the job finishes in copy_callback(). The initial semaphore value is configurable through a module parameter, to allow fine tuning the maximum number of in-flight COW jobs. Setting this parameter to zero initializes the semaphore to INT_MAX. A default value of 2048 maximum in-flight kcopyd jobs was chosen. This value was decided experimentally as a trade-off between memory consumption, stalling the kernel's workqueues and maintaining a high enough throughput. Re-running the aforementioned test: * Workqueue stalls are eliminated * kcopyd's job slab cache uses a maximum of 130MB * The time taken by the test to write to the snapshot-origin target is reduced from 05m20.48s to 03m26.38s [1] https://github.com/jthornber/device-mapper-test-suiteSigned-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Nikos Tsironis authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit d7e6b8df ] When using kcopyd to run callbacks through dm_kcopyd_do_callback() or submitting copy jobs with a source size of 0, the jobs are pushed directly to the complete_jobs list, which could be under processing by the kcopyd thread. As a result, the kcopyd thread can continue running completed jobs indefinitely, without releasing the CPU, as long as someone keeps submitting new completed jobs through the aforementioned paths. Processing of work items, queued for execution on the same CPU as the currently running kcopyd thread, is thus stalled for excessive amounts of time, hurting performance. Running the following test, from the device mapper test suite [1], dmtest run --suite snapshot -n parallel_io_to_many_snaps_N , with 8 active snapshots, we get, in dmesg, messages like the following: [68899.948523] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 95s! [68899.949282] Showing busy workqueues and worker pools: [68899.949288] workqueue events: flags=0x0 [68899.949295] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 [68899.949306] pending: vmstat_shepherd, cache_reap [68899.949331] workqueue mm_percpu_wq: flags=0x8 [68899.949337] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949345] pending: vmstat_update [68899.949387] workqueue dm_bufio_cache: flags=0x8 [68899.949392] pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256 [68899.949400] pending: work_fn [dm_bufio] [68899.949423] workqueue kcopyd: flags=0x8 [68899.949429] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949437] pending: do_work [dm_mod] [68899.949452] workqueue kcopyd: flags=0x8 [68899.949458] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256 [68899.949466] in-flight: 13:do_work [dm_mod] [68899.949474] pending: do_work [dm_mod] [68899.949487] workqueue kcopyd: flags=0x8 [68899.949493] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949501] pending: do_work [dm_mod] [68899.949515] workqueue kcopyd: flags=0x8 [68899.949521] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949529] pending: do_work [dm_mod] [68899.949541] workqueue kcopyd: flags=0x8 [68899.949547] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 [68899.949555] pending: do_work [dm_mod] [68899.949568] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=95s workers=4 idle: 27130 27223 1084 Fix this by splitting the complete_jobs list into two parts: A user facing part, named callback_jobs, and one used internally by kcopyd, retaining the name complete_jobs. dm_kcopyd_do_callback() and dispatch_job() now push their jobs to the callback_jobs list, which is spliced to the complete_jobs list once, every time the kcopyd thread wakes up. This prevents kcopyd from hogging the CPU indefinitely and causing workqueue stalls. Re-running the aforementioned test: * Workqueue stalls are eliminated * The maximum writing time among all targets is reduced from 09m37.10s to 06m04.85s and the total run time of the test is reduced from 10m43.591s to 7m19.199s [1] https://github.com/jthornber/device-mapper-test-suiteSigned-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Arnaldo Carvalho de Melo authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit bd8d57fb ] The strncpy() function may leave the destination string buffer unterminated, better use strlcpy() that we have a __weak fallback implementation for systems without it. This fixes this warning on an Alpine Linux Edge system with gcc 8.2: util/parse-events.c: In function 'print_symbol_events': util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation] strncpy(name, syms->symbol, MAX_NAME_LEN); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function 'print_symbol_events.constprop', inlined from 'print_events' at util/parse-events.c:2508:2: util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation] strncpy(name, syms->symbol, MAX_NAME_LEN); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In function 'print_symbol_events.constprop', inlined from 'print_events' at util/parse-events.c:2511:2: util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation] strncpy(name, syms->symbol, MAX_NAME_LEN); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 947b4ad1 ("perf list: Fix max event string size") Link: https://lkml.kernel.org/n/tip-b663e33bm6x8hrkie4uxh7u2@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Arnaldo Carvalho de Melo authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 2f530253 ] The strncpy() function may leave the destination string buffer unterminated, better use strlcpy() that we have a __weak fallback implementation for systems without it. In this specific case this would only happen if fgets() was buggy, as its man page states that it should read one less byte than the size of the destination buffer, so that it can put the nul byte at the end of it, so it would never copy 255 non-nul chars, as fgets reads into the orig buffer at most 254 non-nul chars and terminates it. But lets just switch to strlcpy to keep the original intent and silence the gcc 8.2 warning. This fixes this warning on an Alpine Linux Edge system with gcc 8.2: In function 'cpu_model', inlined from 'svg_cpu_box' at util/svghelper.c:378:2: util/svghelper.c:337:5: error: 'strncpy' output may be truncated copying 255 bytes from a string of length 255 [-Werror=stringop-truncation] strncpy(cpu_m, &buf[13], 255); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Fixes: f48d55ce ("perf: Add a SVG helper library file") Link: https://lkml.kernel.org/n/tip-xzkoo0gyr56gej39ltivuh9g@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Adrian Hunter authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 1c6f709b ] Users should never use 'pt=0', but if they do it may give a meaningless error: $ perf record -e intel_pt/pt=0/u uname Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u). Fix that by forcing 'pt=1'. Committer testing: # perf record -e intel_pt/pt=0/u uname Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u). /bin/dmesg | grep -i perf may provide additional information. # perf record -e intel_pt/pt=0/u uname pt=0 doesn't make sense, forcing pt=1 Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data ] # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/b7c5b4e5-9497-10e5-fd43-5f3e4a0fe51d@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Jonas Danielsson authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit ae460c11 ] On our AT91SAM9260 board we use the same sdio bus for wifi and for the sd card slot. This caused the atmel-mci to give the following splat on the serial console: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 538 at drivers/mmc/host/atmel-mci.c:859 atmci_send_command+0x24/0x44 Modules linked in: CPU: 0 PID: 538 Comm: mmcqd/0 Not tainted 4.14.76 #14 Hardware name: Atmel AT91SAM9 [<c000fccc>] (unwind_backtrace) from [<c000d3dc>] (show_stack+0x10/0x14) [<c000d3dc>] (show_stack) from [<c0017644>] (__warn+0xd8/0xf4) [<c0017644>] (__warn) from [<c0017704>] (warn_slowpath_null+0x1c/0x24) [<c0017704>] (warn_slowpath_null) from [<c033bb9c>] (atmci_send_command+0x24/0x44) [<c033bb9c>] (atmci_send_command) from [<c033e984>] (atmci_start_request+0x1f4/0x2dc) [<c033e984>] (atmci_start_request) from [<c033f3b4>] (atmci_request+0xf0/0x164) [<c033f3b4>] (atmci_request) from [<c0327108>] (mmc_start_request+0x280/0x2d0) [<c0327108>] (mmc_start_request) from [<c032800c>] (mmc_start_areq+0x230/0x330) [<c032800c>] (mmc_start_areq) from [<c03366f8>] (mmc_blk_issue_rw_rq+0xc4/0x310) [<c03366f8>] (mmc_blk_issue_rw_rq) from [<c03372c4>] (mmc_blk_issue_rq+0x118/0x5ac) [<c03372c4>] (mmc_blk_issue_rq) from [<c033781c>] (mmc_queue_thread+0xc4/0x118) [<c033781c>] (mmc_queue_thread) from [<c002daf8>] (kthread+0x100/0x118) [<c002daf8>] (kthread) from [<c000a580>] (ret_from_fork+0x14/0x34) ---[ end trace 594371ddfa284bd6 ]--- This is: WARN_ON(host->cmd); This was fixed on our board by letting atmci_request_end determine what state we are in. Instead of unconditionally setting it to STATE_IDLE on STATE_END_REQUEST. Signed-off-by: Jonas Danielsson <jonas@orbital-systems.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Masahiro Yamada authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit fbac5977 ] An unterminated string literal followed by new line is passed to the parser (with "multi-line strings not supported" warning shown), then handled properly there. On the other hand, an unterminated string literal at end of file is never passed to the parser, then results in memory leak. [Test Code] ----------(Kconfig begin)---------- source "Kconfig.inc" config A bool "a" -----------(Kconfig end)----------- --------(Kconfig.inc begin)-------- config B bool "b\No new line at end of file ---------(Kconfig.inc end)--------- [Summary from Valgrind] Before the fix: LEAK SUMMARY: definitely lost: 16 bytes in 1 blocks ... After the fix: LEAK SUMMARY: definitely lost: 0 bytes in 0 blocks ... Eliminate the memory leak path by handling this case. Of course, such a Kconfig file is wrong already, so I will add an error message later. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Masahiro Yamada authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 77c1c0fa ] Currently, warn_ignore_character() displays invalid file name and line number. The lexer should use current_file->name and yylineno, while the parser should use zconf_curname() and zconf_lineno(). This difference comes from that the lexer is always going ahead of the parser. The parser needs to look ahead one token to make a shift/reduce decision, so the lexer is requested to scan more text from the input file. This commit fixes the warning message from warn_ignored_character(). [Test Code] ----(Kconfig begin)---- / -----(Kconfig end)----- [Output] Before the fix: <none>:0:warning: ignoring unsupported character '/' After the fix: Kconfig:1:warning: ignoring unsupported character '/' Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Lucas Stach authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit f7542d81 ] The exclusive gates may be set up in the wrong way by software running before the clock driver comes up. In that case the exclusive setup is locked in its initial state, as the complementary function can't be activated without disabling the initial setup first. To avoid this lock situation, reset the exclusive gates to the off state and allow the kernel to provide the proper setup. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Dong Aisheng <Aisheng.dong@nxp.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
David Disseldorp authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 0de26357 ] spc5r17.pdf specifies: 4.3.1 ASCII data field requirements ASCII data fields shall contain only ASCII printable characters (i.e., code values 20h to 7Eh) and may be terminated with one or more ASCII null (00h) characters. ASCII data fields described as being left-aligned shall have any unused bytes at the end of the field (i.e., highest offset) and the unused bytes shall be filled with ASCII space characters (20h). LIO currently space-pads the T10 VENDOR IDENTIFICATION and PRODUCT IDENTIFICATION fields in the standard INQUIRY data. However, the PRODUCT REVISION LEVEL field in the standard INQUIRY data as well as the T10 VENDOR IDENTIFICATION field in the INQUIRY Device Identification VPD Page are zero-terminated/zero-padded. Fix this inconsistency by using space-padding for all of the above fields. Signed-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bryant G. Ly <bly@catalogicsoftware.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
yupeng authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 0fbe82e6 ] after set SO_DONTROUTE to 1, the IP layer should not route packets if the dest IP address is not in link scope. But if the socket has cached the dst_entry, such packets would be routed until the sk_dst_cache expires. So we should clean the sk_dst_cache when a user set SO_DONTROUTE option. Below are server/client python scripts which could reprodue this issue: server side code: ========================================================================== import socket import struct import time s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.bind(('0.0.0.0', 9000)) s.listen(1) sock, addr = s.accept() sock.setsockopt(socket.SOL_SOCKET, socket.SO_DONTROUTE, struct.pack('i', 1)) while True: sock.send(b'foo') time.sleep(1) ========================================================================== client side code: ========================================================================== import socket import time s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect(('server_address', 9000)) while True: data = s.recv(1024) print(data) ========================================================================== Signed-off-by: yupeng <yupeng0921@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Nathan Chancellor authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit b2e9a4ed ] Clang warns: drivers/media/firewire/firedtv-avc.c:999:45: warning: implicit conversion from 'int' to 'char' changes value from 159 to -97 [-Wconstant-conversion] app_info[0] = (EN50221_TAG_APP_INFO >> 16) & 0xff; ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ drivers/media/firewire/firedtv-avc.c:1000:45: warning: implicit conversion from 'int' to 'char' changes value from 128 to -128 [-Wconstant-conversion] app_info[1] = (EN50221_TAG_APP_INFO >> 8) & 0xff; ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ drivers/media/firewire/firedtv-avc.c:1040:44: warning: implicit conversion from 'int' to 'char' changes value from 159 to -97 [-Wconstant-conversion] app_info[0] = (EN50221_TAG_CA_INFO >> 16) & 0xff; ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ drivers/media/firewire/firedtv-avc.c:1041:44: warning: implicit conversion from 'int' to 'char' changes value from 128 to -128 [-Wconstant-conversion] app_info[1] = (EN50221_TAG_CA_INFO >> 8) & 0xff; ~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~ 4 warnings generated. Change app_info's type to unsigned char to match the type of the member msg in struct ca_msg, which is the only thing passed into the app_info parameter in this function. Link: https://github.com/ClangBuiltLinux/linux/issues/105Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Breno Leitao authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 2b038cbc ] When booting a pseries kernel with PREEMPT enabled, it dumps the following warning: BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1 caller is pseries_processor_idle_init+0x5c/0x22c CPU: 13 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc3-00090-g12201a0128bc-dirty #828 Call Trace: [c000000429437ab0] [c0000000009c8878] dump_stack+0xec/0x164 (unreliable) [c000000429437b00] [c0000000005f2f24] check_preemption_disabled+0x154/0x160 [c000000429437b90] [c000000000cab8e8] pseries_processor_idle_init+0x5c/0x22c [c000000429437c10] [c000000000010ed4] do_one_initcall+0x64/0x300 [c000000429437ce0] [c000000000c54500] kernel_init_freeable+0x3f0/0x500 [c000000429437db0] [c0000000000112dc] kernel_init+0x2c/0x160 [c000000429437e20] [c00000000000c1d0] ret_from_kernel_thread+0x5c/0x6c This happens because the code calls get_lppaca() which calls get_paca() and it checks if preemption is disabled through check_preemption_disabled(). Preemption should be disabled because the per CPU variable may make no sense if there is a preemption (and a CPU switch) after it reads the per CPU data and when it is used. In this device driver specifically, it is not a problem, because this code just needs to have access to one lppaca struct, and it does not matter if it is the current per CPU lppaca struct or not (i.e. when there is a preemption and a CPU migration). That said, the most appropriate fix seems to be related to avoiding the debug_smp_processor_id() call at get_paca(), instead of calling preempt_disable() before get_paca(). Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Joel Fernandes (Google) authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 30696378 ] The ramoops backend currently calls persistent_ram_save_old() even if a buffer is empty. While this appears to work, it is does not seem like the right thing to do and could lead to future bugs so lets avoid that. It also prevents misleading prints in the logs which claim the buffer is valid. I got something like: found existing buffer, size 0, start 0 When I was expecting: no valid data in buffer (sig = ...) This bails out early (and reports with pr_debug()), since it's an acceptable state. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Daniel Santos authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit a788c527 ] jffs2_sync_fs makes the assumption that if CONFIG_JFFS2_FS_WRITEBUFFER is defined then a write buffer is available and has been initialized. However, this does is not the case when the mtd device has no out-of-band buffer: int jffs2_nand_flash_setup(struct jffs2_sb_info *c) { if (!c->mtd->oobsize) return 0; ... The resulting call to cancel_delayed_work_sync passing a uninitialized (but zeroed) delayed_work struct forces lockdep to become disabled. [ 90.050639] overlayfs: upper fs does not support tmpfile. [ 90.652264] INFO: trying to register non-static key. [ 90.662171] the code is fine but needs lockdep annotation. [ 90.673090] turning off the locking correctness validator. [ 90.684021] CPU: 0 PID: 1762 Comm: mount_root Not tainted 4.14.63 #0 [ 90.696672] Stack : 00000000 00000000 80d8f6a2 00000038 805f0000 80444600 8fe364f4 805dfbe7 [ 90.713349] 80563a30 000006e2 8068370c 00000001 00000000 00000001 8e2fdc48 ffffffff [ 90.730020] 00000000 00000000 80d90000 00000000 00000106 00000000 6465746e 312e3420 [ 90.746690] 6b636f6c 03bf0000 f8000000 20676e69 00000000 80000000 00000000 8e2c2a90 [ 90.763362] 80d90000 00000001 00000000 8e2c2a90 00000003 80260dc0 08052098 80680000 [ 90.780033] ... [ 90.784902] Call Trace: [ 90.789793] [<8000f0d8>] show_stack+0xb8/0x148 [ 90.798659] [<8005a000>] register_lock_class+0x270/0x55c [ 90.809247] [<8005cb64>] __lock_acquire+0x13c/0xf7c [ 90.818964] [<8005e314>] lock_acquire+0x194/0x1dc [ 90.828345] [<8003f27c>] flush_work+0x200/0x24c [ 90.837374] [<80041dfc>] __cancel_work_timer+0x158/0x210 [ 90.847958] [<801a8770>] jffs2_sync_fs+0x20/0x54 [ 90.857173] [<80125cf4>] iterate_supers+0xf4/0x120 [ 90.866729] [<80158fc4>] sys_sync+0x44/0x9c [ 90.875067] [<80014424>] syscall_common+0x34/0x58 Signed-off-by: Daniel Santos <daniel.santos@pobox.com> Reviewed-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Anders Roxell authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 81e9fa8b ] The armv8_pmuv3 driver doesn't have a remove function, and when the test 'CONFIG_DEBUG_TEST_DRIVER_REMOVE=y' is enabled, the following Call trace can be seen. [ 1.424287] Failed to register pmu: armv8_pmuv3, reason -17 [ 1.424870] WARNING: CPU: 0 PID: 1 at ../kernel/events/core.c:11771 perf_event_sysfs_init+0x98/0xdc [ 1.425220] Modules linked in: [ 1.425531] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.19.0-rc7-next-20181012-00003-ge7a97b1ad77b-dirty #35 [ 1.425951] Hardware name: linux,dummy-virt (DT) [ 1.426212] pstate: 80000005 (Nzcv daif -PAN -UAO) [ 1.426458] pc : perf_event_sysfs_init+0x98/0xdc [ 1.426720] lr : perf_event_sysfs_init+0x98/0xdc [ 1.426908] sp : ffff00000804bd50 [ 1.427077] x29: ffff00000804bd50 x28: ffff00000934e078 [ 1.427429] x27: ffff000009546000 x26: 0000000000000007 [ 1.427757] x25: ffff000009280710 x24: 00000000ffffffef [ 1.428086] x23: ffff000009408000 x22: 0000000000000000 [ 1.428415] x21: ffff000009136008 x20: ffff000009408730 [ 1.428744] x19: ffff80007b20b400 x18: 000000000000000a [ 1.429075] x17: 0000000000000000 x16: 0000000000000000 [ 1.429418] x15: 0000000000000400 x14: 2e79726f74636572 [ 1.429748] x13: 696420656d617320 x12: 656874206e692065 [ 1.430060] x11: 6d616e20656d6173 x10: 2065687420687469 [ 1.430335] x9 : ffff00000804bd50 x8 : 206e6f7361657220 [ 1.430610] x7 : 2c3376756d705f38 x6 : ffff00000954d7ce [ 1.430880] x5 : 0000000000000000 x4 : 0000000000000000 [ 1.431226] x3 : 0000000000000000 x2 : ffffffffffffffff [ 1.431554] x1 : 4d151327adc50b00 x0 : 0000000000000000 [ 1.431868] Call trace: [ 1.432102] perf_event_sysfs_init+0x98/0xdc [ 1.432382] do_one_initcall+0x6c/0x1a8 [ 1.432637] kernel_init_freeable+0x1bc/0x280 [ 1.432905] kernel_init+0x18/0x160 [ 1.433115] ret_from_fork+0x10/0x18 [ 1.433297] ---[ end trace 27fd415390eb9883 ]--- Rework to set suppress_bind_attrs flag to avoid removing the device when CONFIG_DEBUG_TEST_DRIVER_REMOVE=y, since there's no real reason to remove the armv8_pmuv3 driver. Cc: Arnd Bergmann <arnd@arndb.de> Co-developed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Maciej W. Rozycki authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit e4849aff ] The Broadcom SiByte BCM1250, BCM1125, and BCM1125H SOCs have an onchip DRAM controller that supports memory amounts of up to 16GiB, and due to how the address decoder has been wired in the SOC any memory beyond 1GiB is actually mapped starting from 4GiB physical up, that is beyond the 32-bit addressable limit[1]. Consequently if the maximum amount of memory has been installed, then it will span up to 19GiB. Many of the evaluation boards we support that are based on one of these SOCs have their memory soldered and the amount present fits in the 32-bit address range. The BCM91250A SWARM board however has actual DIMM slots and accepts, depending on the peripherals revision of the SOC, up to 4GiB or 8GiB of memory in commercially available JEDEC modules[2]. I believe this is also the case with the BCM91250C2 LittleSur board. This means that up to either 3GiB or 7GiB of memory requires 64-bit addressing to access. I believe the BCM91480B BigSur board, which has the BCM1480 SOC instead, accepts at least as much memory, although I have no documentation or actual hardware available to verify that. Both systems have PCI slots installed for use by any PCI option boards, including ones that only support 32-bit addressing (additionally the 32-bit PCI host bridge of the BCM1250, BCM1125, and BCM1125H SOCs limits addressing to 32-bits), and there is no IOMMU available. Therefore for PCI DMA to work in the presence of memory beyond enable swiotlb for the affected systems. All the other SOC onchip DMA devices use 40-bit addressing and therefore can address the whole memory, so only enable swiotlb if PCI support and support for DMA beyond 4GiB have been both enabled in the configuration of the kernel. This shows up as follows: Broadcom SiByte BCM1250 B2 @ 800 MHz (SB1 rev 2) Board type: SiByte BCM91250A (SWARM) Determined physical RAM map: memory: 000000000fe7fe00 @ 0000000000000000 (usable) memory: 000000001ffffe00 @ 0000000080000000 (usable) memory: 000000000ffffe00 @ 00000000c0000000 (usable) memory: 0000000087fffe00 @ 0000000100000000 (usable) software IO TLB: mapped [mem 0xcbffc000-0xcfffc000] (64MB) in the bootstrap log and removes failures like these: defxx 0000:02:00.0: dma_direct_map_page: overflow 0x0000000185bc6080+4608 of device mask ffffffff bus mask 0 fddi0: Receive buffer allocation failed fddi0: Adapter open failed! IP-Config: Failed to open fddi0 defxx 0000:09:08.0: dma_direct_map_page: overflow 0x0000000185bc6080+4608 of device mask ffffffff bus mask 0 fddi1: Receive buffer allocation failed fddi1: Adapter open failed! IP-Config: Failed to open fddi1 when memory beyond 4GiB is handed out to devices that can only do 32-bit addressing. This updates commit cce335ae ("[MIPS] 64-bit Sibyte kernels need DMA32."). References: [1] "BCM1250/BCM1125/BCM1125H User Manual", Revision 1250_1125-UM100-R, Broadcom Corporation, 21 Oct 2002, Section 3: "System Overview", "Memory Map", pp. 34-38 [2] "BCM91250A User Manual", Revision 91250A-UM100-R, Broadcom Corporation, 18 May 2004, Section 3: "Physical Description", "Supported DRAM", p. 23 Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> [paul.burton@mips.com: Remove GPL text from dma.c; SPDX tag covers it] Signed-off-by: Paul Burton <paul.burton@mips.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Patchwork: https://patchwork.linux-mips.org/patch/21108/ References: cce335ae ("[MIPS] 64-bit Sibyte kernels need DMA32.") Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Anders Roxell authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 347a28b5 ] This happened while running in qemu-system-aarch64, the AMBA PL011 UART driver when enabling CONFIG_DEBUG_TEST_DRIVER_REMOVE. arch_initcall(pl011_init) came before subsys_initcall(default_bdi_init), devtmpfs' handle_remove() crashes because the reference count is a NULL pointer only because wb->bdi hasn't been initialized yet. Rework so that wb_put have an extra check if wb->bdi before decrement wb->refcnt and also add a WARN_ON_ONCE to get a warning if it happens again in other drivers. Fixes: 52ebea74 ("writeback: make backing_dev_info host cgroup-specific bdi_writebacks") Co-developed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
Miroslav Lichvar authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit e1f65b0d ] It seems with some NICs supported by the e1000e driver a SYSTIM reading may occasionally be few microseconds before the previous reading and if enabled also pass e1000e_sanitize_systim() without reaching the maximum number of rereads, even if the function is modified to check three consecutive readings (i.e. it doesn't look like a double read error). This causes an underflow in the timecounter and the PHC time jumps hours ahead. This was observed on 82574, I217 and I219. The fastest way to reproduce it is to run a program that continuously calls the PTP_SYS_OFFSET ioctl on the PHC. Modify e1000e_phc_gettime() to use timecounter_cyc2time() instead of timecounter_read() in order to allow non-monotonic SYSTIM readings and prevent the PHC from jumping. Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
João Paulo Rechi Vita authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit 78f3ac76 ] In the past, Asus firmwares would change the panel backlight directly through the EC when the display off hotkey (Fn+F7) was pressed, and only notify the OS of such change, with 0x33 when the LCD was ON and 0x34 when the LCD was OFF. These are currently mapped to KEY_DISPLAYTOGGLE and KEY_DISPLAY_OFF, respectively. Most recently the EC on Asus most machines lost ability to toggle the LCD backlight directly, but unless the OS informs the firmware it is going to handle the display toggle hotkey events, the firmware still tries change the brightness through the EC, to no effect. The end result is a long list (at Endless we counted 11) of Asus laptop models where the display toggle hotkey does not perform any action. Our firmware engineers contacts at Asus were surprised that there were still machines out there with the old behavior. Calling WMNB(ASUS_WMI_DEVID_BACKLIGHT==0x00050011, 2) on the _WDG device tells the firmware that it should let the OS handle the display toggle event, in which case it will simply notify the OS of a key press with 0x35, as shown by the DSDT excerpts bellow. Scope (_SB) { (...) Device (ATKD) { (...) Name (_WDG, Buffer (0x28) { /* 0000 */ 0xD0, 0x5E, 0x84, 0x97, 0x6D, 0x4E, 0xDE, 0x11, /* 0008 */ 0x8A, 0x39, 0x08, 0x00, 0x20, 0x0C, 0x9A, 0x66, /* 0010 */ 0x4E, 0x42, 0x01, 0x02, 0x35, 0xBB, 0x3C, 0x0B, /* 0018 */ 0xC2, 0xE3, 0xED, 0x45, 0x91, 0xC2, 0x4C, 0x5A, /* 0020 */ 0x6D, 0x19, 0x5D, 0x1C, 0xFF, 0x00, 0x01, 0x08 }) Method (WMNB, 3, Serialized) { CreateDWordField (Arg2, Zero, IIA0) CreateDWordField (Arg2, 0x04, IIA1) Local0 = (Arg1 & 0xFFFFFFFF) (...) If ((Local0 == 0x53564544)) { (...) If ((IIA0 == 0x00050011)) { If ((IIA1 == 0x02)) { ^^PCI0.SBRG.EC0.SPIN (0x72, One) ^^PCI0.SBRG.EC0.BLCT = One } Return (One) } } (...) } (...) } (...) } (...) Scope (_SB.PCI0.SBRG.EC0) { (...) Name (BLCT, Zero) (...) Method (_Q10, 0, NotSerialized) // _Qxx: EC Query { If ((BLCT == Zero)) { Local0 = One Local0 = RPIN (0x72) Local0 ^= One SPIN (0x72, Local0) If (ATKP) { Local0 = (0x34 - Local0) ^^^^ATKD.IANE (Local0) } } ElseIf ((BLCT == One)) { If (ATKP) { ^^^^ATKD.IANE (0x35) } } } (...) } Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-
David Ahern authored
BugLink: https://bugs.launchpad.net/bugs/1818797 [ Upstream commit d4a7e9bb ] I realized the last patch calls dev_get_by_index_rcu in a branch not holding the rcu lock. Add the calls to rcu_read_lock and rcu_read_unlock. Fixes: ec90ad33 ("ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Juerg Haefliger <juergh@canonical.com> Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
-