- 07 Mar, 2014 12 commits
-
-
David S. Miller authored
Zoltan Kiss says: ==================== xen-netback: TX grant mapping with SKBTX_DEV_ZEROCOPY instead of copy A long known problem of the upstream netback implementation that on the TX path (from guest to Dom0) it copies the whole packet from guest memory into Dom0. That simply became a bottleneck with 10Gb NICs, and generally it's a huge perfomance penalty. The classic kernel version of netback used grant mapping, and to get notified when the page can be unmapped, it used page destructors. Unfortunately that destructor is not an upstreamable solution. Ian Campbell's skb fragment destructor patch series [1] tried to solve this problem, however it seems to be very invasive on the network stack's code, and therefore haven't progressed very well. This patch series use SKBTX_DEV_ZEROCOPY flags to tell the stack it needs to know when the skb is freed up. That is the way KVM solved the same problem, and based on my initial tests it can do the same for us. Avoiding the extra copy boosted up TX throughput from 6.8 Gbps to 7.9 (I used a slower AMD Interlagos box, both Dom0 and guest on upstream kernel, on the same NUMA node, running iperf 2.0.5, and the remote end was a bare metal box on the same 10Gb switch) Based on my investigations the packet get only copied if it is delivered to Dom0 IP stack through deliver_skb, which is due to this [2] patch. This affects DomU->Dom0 IP traffic and when Dom0 does routing/NAT for the guest. That's a bit unfortunate, but luckily it doesn't cause a major regression for this usecase. In the future we should try to eliminate that copy somehow. There are a few spinoff tasks which will be addressed in separate patches: - grant copy the header directly instead of map and memcpy. This should help us avoiding TLB flushing - use something else than ballooned pages - fix grant map to use page->index properly I've tried to broke it down to smaller patches, with mixed results, so I welcome suggestions on that part as well: 1: Use skb->cb to store pending_idx 2: Some refactoring 3: Change RX path for mapped SKB fragments (moved here to keep bisectability, review it after #4) 4: Introduce TX grant mapping 5: Remove old TX grant copy definitons and fix indentations 6: Add stat counters for zerocopy 7: Handle guests with too many frags 8: Timeout packets in RX path 9: Aggregate TX unmap operations v2: I've fixed some smaller things, see the individual patches. I've added a few new stat counters, and handling the important use case when an older guest sends lots of slots. Instead of delayed copy now we timeout packets on the RX path, based on the assumption that otherwise packets should get stucked anywhere else. Finally some unmap batching to avoid too much TLB flush v3: Apart from fixing a few things mentioned in responses the important change is the use the hypercall directly for grant [un]mapping, therefore we can avoid m2p override. v4: Now we are using a new grant mapping API to avoid m2p_override. The RX queue timeout logic changed also. v5: Only minor fixes based on Wei's comments v6: Important bugfixes for xenvif_poll exit path and zerocopy callback, see first 2 patches. Also rework of handling packets with too many slots, and reorder the series a bit. v7: Small fixes in comments/log messages/error paths, and merging the frag overflow stats patch into its parent. [1] http://lwn.net/Articles/491522/ [2] https://lkml.org/lkml/2012/7/20/363 ==================== Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zoltan Kiss authored
Unmapping causes TLB flushing, therefore we should make it in the largest possible batches. However we shouldn't starve the guest for too long. So if the guest has space for at least two big packets and we don't have at least a quarter ring to unmap, delay it for at most 1 milisec. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zoltan Kiss authored
A malicious or buggy guest can leave its queue filled indefinitely, in which case qdisc start to queue packets for that VIF. If those packets came from an another guest, it can block its slots and prevent shutdown. To avoid that, we make sure the queue is drained in every 10 seconds. The QDisc queue in worst case takes 3 round to flush usually. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zoltan Kiss authored
Xen network protocol had implicit dependency on MAX_SKB_FRAGS. Netback has to handle guests sending up to XEN_NETBK_LEGACY_SLOTS_MAX slots. To achieve that: - create a new skb - map the leftover slots to its frags (no linear buffer here!) - chain it to the previous through skb_shinfo(skb)->frag_list - map them - copy and coalesce the frags into a brand new one and send it to the stack - unmap the 2 old skb's pages It's also introduces new stat counters, which help determine how often the guest sends a packet with more than MAX_SKB_FRAGS frags. NOTE: if bisect brought you here, you should apply the series up until "xen-netback: Timeout packets in RX path", otherwise malicious guests can block other guests by not releasing their sent packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zoltan Kiss authored
These counters help determine how often the buffers had to be copied. Also they help find out if packets are leaked, as if "sent != success + fail", there are probably packets never freed up properly. NOTE: if bisect brought you here, you should apply the series up until "xen-netback: Timeout packets in RX path", otherwise Windows guests can't work properly and malicious guests can block other guests by not releasing their sent packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zoltan Kiss authored
These became obsolete with grant mapping. I've left intentionally the indentations in this way, to improve readability of previous patches. NOTE: if bisect brought you here, you should apply the series up until "xen-netback: Timeout packets in RX path", otherwise Windows guests can't work properly and malicious guests can block other guests by not releasing their sent packets. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zoltan Kiss authored
This patch introduces grant mapping on netback TX path. It replaces grant copy operations, ditching grant copy coalescing along the way. Another solution for copy coalescing is introduced in "xen-netback: Handle guests with too many frags", older guests and Windows can broke before that patch applies. There is a callback (xenvif_zerocopy_callback) from core stack to release the slots back to the guests when kfree_skb or skb_orphan_frags called. It feeds a separate dealloc thread, as scheduling NAPI instance from there is inefficient, therefore we can't do dealloc from the instance. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zoltan Kiss authored
RX path need to know if the SKB fragments are stored on pages from another domain. Logically this patch should be after introducing the grant mapping itself, as it makes sense only after that. But to keep bisectability, I moved it here. It shouldn't change any functionality here. xenvif_zerocopy_callback and ubuf_to_vif are just stubs here, they will be introduced properly later on. Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zoltan Kiss authored
This patch contains a few bits of refactoring before introducing the grant mapping changes: - introducing xenvif_tx_pending_slots_available(), as this is used several times, and will be used more often - rename the thread to vifX.Y-guest-rx, to signify it does RX work from the guest point of view Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zoltan Kiss authored
Storing the pending_idx at the first byte of the linear buffer never looked good, skb->cb is a more proper place for this. It also prevents the header to be directly grant copied there, and we don't have the pending_idx after we copied the header here, so it's time to change it. It also introduces helpers for the RX side Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
There is no reason to orphan skb in l2tp. This breaks things like per socket memory limits, TCP Small queues... Fix this before more people copy/paste it. This is very similar to commit 8f646c92 ("vxlan: keep original skb ownership") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Usage of skb->tstamp should remain private to TCP stack (only set on packets on write queue, not on cloned ones) Otherwise, packets given to loopback interface with a non null tstamp can confuse netif_rx() / net_timestamp_check() Other possibility would be to clear tstamp in loopback_xmit(), as done in skb_scrub_packet() Fixes: 740b0f18 ("tcp: switch rtt estimations to usec resolution") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 Mar, 2014 17 commits
-
-
stephen hemminger authored
This is a fixup patch to resolve issues with const from my earlier patch. Make all the setter functions use const on input parameter. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
htb_dump() and htb_dump_class() do not strictly need to acquire qdisc lock to fetch qdisc and/or class parameters. We hold RTNL and no changes can occur. This reduces by 50% qdisc lock pressure while doing tc qdisc|class dump operations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Alexander Aring says: ==================== 6lowpan: header cleanup this patch series fix a missing include of 6LoWPAN header and move it into the include/net directory. Since we did some code sharing with bluetooth 6LoWPAN the header turns into a generic header for 6LoWPAN. Instead to use a relative path in bluetooth 6LoWPAN we can now use include <net/6lowpan.h>. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
This header is used by bluetooth and ieee802154 branch. This patch move this header to the include/net directory to avoid a use of a relative path in include. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
The 6lowpan.h file contains some static inline function which use internal ipv6 api structs. Add a include of ipv6.h to be sure that it's known before. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Vecera authored
v2: remove unnecessary braces from all 'loopback' if-blocks (thx Sergei) Cc: sathya.perla@emulex.com Cc: subbu.seetharaman@emulex.com Cc: ajit.khaparde@emulex.com Cc: sergei.shtylyov@cogentembedded.com Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Fengguang Wu authored
Fix static error introduced by commit: b97b33a3 [645/653] net/mlx4_en: Verify mlx4_en module parameters sparse warnings: drivers/net/ethernet/mellanox/mlx4/en_main.c:335:6: sparse: symbol 'mlx4_en_verify_params' was not declared. Should it be static? CC: netdev@vger.kernel.org CC: linux-kernel@vger.kernel.org CC: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
Make local functions static (ie. only used in bond_options.c) Make bond options parsing tables constant. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
These functions are defined but no longer used. Compile tested only. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Can be invoked from non-BH context. Based upon a patch by Eric Dumazet. Fixes: f19c29e3 ("tcp: snmp stats for Fast Open, SYN rtx, and data pkts") Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Veaceslav Falico authored
Currently we're using GFP_KERNEL, however there are some path(s) where we can hold some spinlocks, specifically bond->curr_slave_lock: [ 4.722916] BUG: sleeping function called from invalid context at mm/slub.c:965 [ 4.724438] in_atomic(): 1, irqs_disabled(): 0, pid: 940, name: ifup-eth [ 4.726034] 5 locks held by ifup-eth/940: ...snip... [ 4.734646] #4: (&bond->curr_slave_lock){+...+.}, at: [<ffffffffa00badc6>] bond_enslave+0xda6/0xdd0 [bonding] ...snip... [ 4.759081] [<ffffffffa00b6f11>] bond_change_active_slave+0x191/0x3b0 [bonding] [ 4.760917] [<ffffffffa00b7227>] bond_select_active_slave+0xf7/0x1d0 [bonding] [ 4.762751] [<ffffffffa00badce>] bond_enslave+0xdae/0xdd0 [bonding] ...snip... As it's out of hot path and is a really rare event - change the gfp_t flags to GFP_ATOMIC to avoid sleeping under spinlock. v2: convert new notify calls to GFP_ATOMIC. CC: Thomas Glanzmann <thomas@glanzmann.de> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hannes Frederic Sowa authored
Commit e688a604 ("net: introduce DST_NOPEER dst flag") introduced DST_NOPEER because because of crashes in ipv6_select_ident called from udp6_ufo_fragment. Since commit 916e4cf4 ("ipv6: reuse ip6_frag_id from ip6_ufo_append_data") we don't call ipv6_select_ident any more from ip6_ufo_append_data, thus this flag lost its purpose and can be removed. Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Hayes Wang says: ==================== r8152: cleanups Deal with some empty lines and spaces, replace some tp->netdev with netdev, and remove the unnecessary function. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
hayeswang authored
The rtl8152_get_stats() returns the point address of the struct net_device_stats. This could be got from struct net_device directly. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
hayeswang authored
Replace some tp->netdev with netdev. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
hayeswang authored
Add or remove some empty lines. Replace the spaces with the tabs. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller authored
Conflicts: drivers/net/wireless/ath/ath9k/recv.c drivers/net/wireless/mwifiex/pcie.c net/ipv6/sit.c The SIT driver conflict consists of a bug fix being done by hand in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper was created (netdev_alloc_pcpu_stats()) which takes care of this. The two wireless conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 05 Mar, 2014 2 commits
-
-
Alexander Aring authored
This patch fixes some whitespace issues in Kconfig files of IEEE 802.15.4 subsytem. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
Since commit 8fad346f (ieee802154: add basic support for RF212 to at86rf230 driver) we support at86rf212 as well. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Mar, 2014 9 commits
-
-
Sathya Perla authored
The driver currently maps a page for DMA, divides the page into multiple frags and posts them to the HW. It un-maps the page after data is received on all the frags of the page. This scheme doesn't work when bounce buffers are used for DMA (swiotlb=force kernel param). This patch fixes this problem by calling dma_sync_single_for_cpu() for each frag (excepting the last one) so that the data is copied from the bounce buffers. The page is un-mapped only when DMA finishes on the last frag of the page. (Thanks Ben H. for suggesting the dma_sync API!) This patch also renames the "last_page_user" field of be_rx_page_info{} struct to "last_frag" to improve readability of the fixed code. Reported-by: Li Fengmao <li.fengmao@zte.com.cn> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Simon Wunderlich says: ==================== this series contains a header file proposal for MPLS labels. These labels do not seem to be properly defined in the kernel so far. We are developing a wired/wireless 802.21/MPLS switch and need to check the MPLS labels to use the traffic control info for transmissions over 802.11 networks. Changes to third version: * rename mpls_label_stack to mpls_label (thanks Neil) * fix over-indendented closing brac (thanks Sergei) * add Johannes' Ack ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Simon Wunderlich authored
MPLS labels may contain traffic control information, which should be evaluated and used by the wireless subsystem if present. Also check for IEEE 802.21 which is always network control traffic. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Simon Wunderlich authored
Labels for the Multiprotocol Label Switching are defined in RFC 3032 which was superseded by RFC 5462. Add the definition to UAPI and a stub header for include/linux. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Simon Wunderlich authored
Add the Ethertype for IEEE Std 802.21 - Media Independent Handover Protocol. This Ethertype is used for network control messages. Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Fix memory leak in ieee80211_prep_connection(), sta_info leaked on error. From Eytan Lifshitz. 2) Unintentional switch case fallthrough in nft_reject_inet_eval(), from Patrick McHardy. 3) Must check if payload lenth is a power of 2 in nft_payload_select_ops(), from Nikolay Aleksandrov. 4) Fix mis-checksumming in xen-netfront driver, ip_hdr() is not in the correct place when we invoke skb_checksum_setup(). From Wei Liu. 5) TUN driver should not advertise HW vlan offload features in vlan_features. Fix from Fernando Luis Vazquez Cao. 6) IPV6_VTI needs to select NET_IPV_TUNNEL to avoid build errors, fix from Steffen Klassert. 7) Add missing locking in xfrm_migrade_state_find(), we must hold the per-namespace xfrm_state_lock while traversing the lists. Fix from Steffen Klassert. 8) Missing locking in ath9k driver, access to tid->sched must be done under ath_txq_lock(). Fix from Stanislaw Gruszka. 9) Fix two bugs in TCP fastopen. First respect the size argument given to tcp_sendmsg() in the fastopen path, and secondly prevent tcp_send_syn_data() from potentially using order-5 allocations. From Eric Dumazet. 10) Fix handling of default neigh garbage collection params, from Jiri Pirko. 11) Fix cwnd bloat and over-inflation of RTT when transmit segmentation is in use. From Eric Dumazet. 12) Missing initialization of Realtek r8169 driver's statistics seqlocks. Fix from Kyle McMartin. 13) Fix RTNL assertion failures in 802.3ad and AB ARP monitor of bonding driver, from Ding Tianhong. 14) Bonding slave release race can cause divide by zero, fix from Nikolay Aleksandrov. 15) Overzealous return from neigh_periodic_work() causes reachability time to not be computed. Fix from Duain Jiong. 16) Fix regression in ipv6_find_hdr(), it should not return -ENOENT when a specific target is specified and found. From Hans Schillstrom. 17) Fix VLAN tag stripping regression in BNA driver, from Ivan Vecera. 18) Tail loss probe can calculate bogus RTTs due to missing packet marking on retransmit. Fix from Yuchung Cheng. 19) We cannot do skb_dst_drop() in iptunnel_pull_header() because multicast loopback detection in later code paths need access to skb_rtable(). Fix from Xin Long. 20) The macvlan driver regresses in that it propagates lower device offload support disables into itself, causing severe slowdowns when running over a bridge. Provide the software offloads always on macvlan devices to deal with this and the regression is gone. From Vlad Yasevich. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits) macvlan: Add support for 'always_on' offload features net: sctp: fix sctp_sf_do_5_1D_ce to verify if we/peer is AUTH capable ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer net: cpsw: fix cpdma rx descriptor leak on down interface be2net: isolate TX workarounds not applicable to Skyhawk-R be2net: Fix skb double free in be_xmit_wrokarounds() failure path be2net: clear promiscuous bits in adapter->flags while disabling promiscuous mode be2net: Fix to reset transparent vlan tagging qlcnic: dcb: a couple off by one bugs tcp: fix bogus RTT on special retransmission hsr: off by one sanity check in hsr_register_frame_in() can: remove CAN FD compatibility for CAN 2.0 sockets can: flexcan: factor out soft reset into seperate funtion can: flexcan: flexcan_remove(): add missing netif_napi_del() can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze can: flexcan: factor out transceiver {en,dis}able into seperate functions can: flexcan: fix transition from and to low power mode in chip_{en,dis}able can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails can: flexcan: fix shutdown: first disable chip, then all interrupts USB AX88179/178A: Support D-Link DUB-1312 ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulatorLinus Torvalds authored
Pull regulator fixes from Mark Brown: "A couple of fixes here which ensure that regulators using the core support for GPIO enables work in all cases by ensuring that helpers are used consistently rather than open coding in places and hence not having GPIO support in some of them" * tag 'regulator-v3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: core: Replace direct ops->disable usage regulator: core: Replace direct ops->enable usage
-
Linus Torvalds authored
Merge misc fixes from Andrew Morton. * emailed patches from Andrew Morton akpm@linux-foundation.org>: mm: page_alloc: exempt GFP_THISNODE allocations from zone fairness mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS MAINTAINERS: add and correct types of some "T:" entries MAINTAINERS: use tab for separator rapidio/tsi721: fix tasklet termination in dma channel release hfsplus: fix remount issue zram: avoid null access when fail to alloc meta sh: prefix sh-specific "CCR" and "CCR2" by "SH_" ocfs2: fix quota file corruption drivers/rtc/rtc-s3c.c: fix incorrect way of save/restore of S3C2410_TICNT for TYPE_S3C64XX kallsyms: fix absolute addresses for kASLR scripts/gen_initramfs_list.sh: fix flags for initramfs LZ4 compression mm: include VM_MIXEDMAP flag in the VM_SPECIAL list to avoid m(un)locking memcg: reparent charges of children before processing parent memcg: fix endless loop in __mem_cgroup_iter_next() lib/radix-tree.c: swapoff tmpfs radix_tree: remember to rcu_read_unlock dma debug: account for cachelines and read-only mappings in overlap tracking mm: close PageTail race MAINTAINERS: EDAC: add Mauro and Borislav as interim patch collectors
-
Johannes Weiner authored
Jan Stancek reports manual page migration encountering allocation failures after some pages when there is still plenty of memory free, and bisected the problem down to commit 81c0a2bb ("mm: page_alloc: fair zone allocator policy"). The problem is that GFP_THISNODE obeys the zone fairness allocation batches on one hand, but doesn't reset them and wake kswapd on the other hand. After a few of those allocations, the batches are exhausted and the allocations fail. Fixing this means either having GFP_THISNODE wake up kswapd, or GFP_THISNODE not participating in zone fairness at all. The latter seems safer as an acute bugfix, we can clean up later. Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: <stable@kernel.org> [3.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-