Commits · 1169a965bbeef6f4e9eb925b509e19959c020d85 · Kirill Smelkov / linux

20 Jan, 2017 40 commits

Drivers: hv: Introduce a policy for controlling channel affinity · 1169a965

K. Y. Srinivasan authored Sep 02, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Introduce a mechanism to control how channels will be affinitized. We will
support two policies:

1. HV_BALANCED: All performance critical channels will be dstributed
evenly amongst all the available NUMA nodes. Once the Node is assigned,
we will assign the CPU based on a simple round robin scheme.

2. HV_LOCALIZED: Only the primary channels are distributed across all
NUMA nodes. Sub-channels will be in the same NUMA node as the primary
channel. This is the current behaviour.

The default policy will be the HV_BALANCED as it can minimize the remote
memory access on NUMA machines with applications that span NUMA nodes.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 509879bd)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

1169a965

Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() · 80c8b91d

Vitaly Kuznetsov authored Sep 02, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

With wrap around mappings in place we can always provide drivers with
direct links to packets on the ring buffer, even when they wrap around.
Do the required updates to get_next_pkt_raw()/put_pkt_raw()
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(back ported from commit bb08d431)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Conflicts:
	include/linux/hyperv.h
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

80c8b91d

Drivers: hv: ring_buffer: use wrap around mappings in hv_copy{from, to}_ringbuffer() · b3a904aa

Vitaly Kuznetsov authored Sep 02, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

With wrap around mappings for ring buffers we can always use a single
memcpy() to do the job.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f24f0b49)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

b3a904aa

Drivers: hv: ring_buffer: wrap around mappings for ring buffers · 06da14bf

Vitaly Kuznetsov authored Sep 02, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Make it possible to always use a single memcpy() or to provide a direct
link to a packet on the ring buffer by creating virtual mapping for two
copies of the ring buffer with vmap(). Utilize currently empty
hv_ringbuffer_cleanup() to do the unmap.

While on it, replace sizeof(struct hv_ring_buffer) check
in hv_ringbuffer_init() with BUILD_BUG_ON() as it is a compile time check.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9988ce68)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

06da14bf

Drivers: hv: cleanup vmbus_open() for wrap around mappings · fea14ef2

Vitaly Kuznetsov authored Sep 02, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

In preparation for doing wrap around mappings for ring buffers cleanup
vmbus_open() function:
- check that ring sizes are PAGE_SIZE aligned (they are for all in-kernel
  drivers now);
- kfree(open_info) on error only after we kzalloc() it (not an issue as it
  is valid to call kfree(NULL);
- rename poorly named labels;
- use alloc_pages() instead of __get_free_pages() as we need struct page
  pointer for future.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 98f531b1)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

fea14ef2

Drivers: hv: balloon: Use available memory value in pressure report · d7e08421

Alex Ng authored Aug 24, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Reports for available memory should use the si_mem_available() value.
The previous freeram value does not include available page cache memory.
Signed-off-by: Alex Ng <alexng@messages.microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b605c2d9)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

d7e08421

Drivers: hv: balloon: replace ha_region_mutex with spinlock · 909c2c2c

Vitaly Kuznetsov authored Aug 24, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

lockdep reports possible circular locking dependency when udev is used
for memory onlining:

 systemd-udevd/3996 is trying to acquire lock:
  ((memory_chain).rwsem){++++.+}, at: [<ffffffff810d137e>] __blocking_notifier_call_chain+0x4e/0xc0

 but task is already holding lock:
  (&dm_device.ha_region_mutex){+.+.+.}, at: [<ffffffffa015382e>] hv_memory_notifier+0x5e/0xc0 [hv_balloon]
 ...

which is probably a false positive because we take and release
ha_region_mutex from memory notifier chain depending on the arg. No real
deadlocks were reported so far (though I'm not really sure about
preemptible kernels...) but we don't really need to hold the mutex
for so long. We use it to protect ha_region_list (and its members) and the
num_pages_onlined counter. None of these operations require us to sleep
and nothing is slow, switch to using spinlock with interrupts disabled.

While on it, replace list_for_each -> list_for_each_entry as we actually
need entries in all these cases, drop meaningless list_empty() checks.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit eece30b9)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

909c2c2c

Drivers: hv: balloon: don't wait for ol_waitevent when memhp_auto_online is enabled · d8193846

Vitaly Kuznetsov authored Aug 24, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

With the recently introduced in-kernel memory onlining
(MEMORY_HOTPLUG_DEFAULT_ONLINE) these is no point in waiting for pages
to come online in the driver and we can get rid of the waiting.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a132c54c)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

d8193846

Drivers: hv: balloon: account for gaps in hot add regions · 10a9978f

Vitaly Kuznetsov authored Aug 24, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

I'm observing the following hot add requests from the WS2012 host:

hot_add_req: start_pfn = 0x108200 count = 330752
hot_add_req: start_pfn = 0x158e00 count = 193536
hot_add_req: start_pfn = 0x188400 count = 239616

As the host doesn't specify hot add regions we're trying to create
128Mb-aligned region covering the first request, we create the 0x108000 -
0x160000 region and we add 0x108000 - 0x158e00 memory. The second request
passes the pfn_covered() check, we enlarge the region to 0x108000 -
0x190000 and add 0x158e00 - 0x188200 memory. The problem emerges with the
third request as it starts at 0x188400 so there is a 0x200 gap which is
not covered. As the end of our region is 0x190000 now it again passes the
pfn_covered() check were we just adjust the covered_end_pfn and make it
0x188400 instead of 0x188200 which means that we'll try to online
0x188200-0x188400 pages but these pages were never assigned to us and we
crash.

We can't react to such requests by creating new hot add regions as it may
happen that the whole suggested range falls into the previously identified
128Mb-aligned area so we'll end up adding nothing or create intersecting
regions and our current logic doesn't allow that. Instead, create a list of
such 'gaps' and check for them in the page online callback.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit cb7a5724)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

10a9978f

Drivers: hv: balloon: keep track of where ha_region starts · 33a05bbb

Vitaly Kuznetsov authored Aug 24, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Windows 2012 (non-R2) does not specify hot add region in hot add requests
and the logic in hot_add_req() is trying to find a 128Mb-aligned region
covering the request. It may also happen that host's requests are not 128Mb
aligned and the created ha_region will start before the first specified
PFN. We can't online these non-present pages but we don't remember the real
start of the region.

This is a regression introduced by the commit 5abbbb75 ("Drivers: hv:
hv_balloon: don't lose memory when onlining order is not natural"). While
the idea of keeping the 'moving window' was wrong (as there is no guarantee
that hot add requests come ordered) we should still keep track of
covered_start_pfn. This is not a revert, the logic is different.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7cf3b79e)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

33a05bbb

Tools: hv: kvp: ensure kvp device fd is closed on exec · b54c9b9b

Vitaly Kuznetsov authored Jul 06, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

KVP daemon does fork()/exec() (with popen()) so we need to close our fds
to avoid sharing them with child processes. The immediate implication of
not doing so I see is SELinux complaining about 'ip' trying to access
'/dev/vmbus/hv_kvp'.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 26840437)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

b54c9b9b

Drivers: hv: vmbus: Implement a mechanism to tag the channel for low latency · 1191c2fb

K. Y. Srinivasan authored Jul 01, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

On Hyper-V, performance critical channels use the monitor
mechanism to signal the host when the guest posts mesages
for the host. This mechanism minimizes the hypervisor intercepts
and also makes the host more efficient in that each time the
host is woken up, it processes a batch of messages as opposed to
just one. The goal here is improve the throughput and this is at
the expense of increased latency.
Implement a mechanism to let the client driver decide if latency
is important.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3724287c)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

1191c2fb

Drivers: hv: vmbus: Reduce the delay between retries in vmbus_post_msg() · 261784a9

K. Y. Srinivasan authored Jul 01, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

The current delay between retries is unnecessarily high and is negatively
affecting the time it takes to boot the system.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 8de0d7e9)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

261784a9

Drivers: hv: vmbus: Enable explicit signaling policy for NIC channels · d188211e

K. Y. Srinivasan authored Jul 01, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

For synthetic NIC channels, enable explicit signaling policy as netvsc wants to
explicitly control when the host is to be signaled.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ccef9bcc)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

d188211e

Drivers: hv: vmbus: fix the race when querying & updating the percpu list · fe9934e7

Dexuan Cui authored Jun 09, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

There is a rare race when we remove an entry from the global list
hv_context.percpu_list[cpu] in hv_process_channel_removal() ->
percpu_channel_deq() -> list_del(): at this time, if vmbus_on_event() ->
process_chn_event() -> pcpu_relid2channel() is trying to query the list,
we can get the kernel fault.

Similarly, we also have the issue in the code path: vmbus_process_offer() ->
percpu_channel_enq().

We can resolve the issue by disabling the tasklet when updating the list.

The patch also moves vmbus_release_relid() to a later place where
the channel has been removed from the per-cpu and the global lists.
Reported-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 638fea33)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

fe9934e7

Drivers: hv: utils: fix a race on userspace daemons registration · 3c2cc50c

Vitaly Kuznetsov authored Jun 09, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Background: userspace daemons registration protocol for Hyper-V utilities
drivers has two steps:
1) daemon writes its own version to kernel
2) kernel reads it and replies with module version
at this point we consider the handshake procedure being completed and we
do hv_poll_channel() transitioning the utility device to HVUTIL_READY
state. At this point we're ready to handle messages from kernel.

When hvutil_transport is in HVUTIL_TRANSPORT_CHARDEV mode we have a
single buffer for outgoing message. hvutil_transport_send() puts to this
buffer and till the buffer is cleared with hvt_op_read() returns -EFAULT
to all consequent calls. Host<->guest protocol guarantees there is no more
than one request at a time and we will not get new requests till we reply
to the previous one so this single message buffer is enough.

Now to the race. When we finish negotiation procedure and send kernel
module version to userspace with hvutil_transport_send() it goes into the
above mentioned buffer and if the daemon is slow enough to read it from
there we can get a collision when a request from the host comes, we won't
be able to put anything to the buffer so the request will be lost. To
solve the issue we need to know when the negotiation is really done (when
the version message is read by the daemon) and transition to HVUTIL_READY
state after this happens. Implement a callback on read to support this.
Old style netlink communication is not affected by the change, we don't
really know when these messages are delivered but we don't have a single
message buffer there.
Reported-by: Barry Davis <barry_davis@stormagic.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e0fa3e5e)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

3c2cc50c

Drivers: hv: get rid of timeout in vmbus_open() · 85ba4860

Vitaly Kuznetsov authored Jun 09, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

vmbus_teardown_gpadl() can result in infinite wait when it is called on 5
second timeout in vmbus_open(). The issue is caused by the fact that gpadl
teardown operation won't ever succeed for an opened channel and the timeout
isn't always enough. As a guest, we can always trust the host to respond to
our request (and there is nothing we can do if it doesn't).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 396e287f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

85ba4860

Drivers: hv: don't leak memory in vmbus_establish_gpadl() · d42c8b04

Vitaly Kuznetsov authored Jun 03, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

In some cases create_gpadl_header() allocates submessages but we never
free them.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7cc80c98)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

d42c8b04

Drivers: hv: get rid of redundant messagecount in create_gpadl_header() · 0fcaaadc

Vitaly Kuznetsov authored Jun 03, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

We use messagecount only once in vmbus_establish_gpadl() to check if
it is safe to iterate through the submsglist. We can just initialize
the list header in all cases in create_gpadl_header() instead.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 4d637632)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

0fcaaadc

hv_netvsc: add ethtool statistics for tx packet issues · 5f0387a3

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Printing console messages is not helpful when system is out of memory;
and can be disastrous with netconsole. Instead keep statistics
of these anomalous conditions.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4323b47c)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

5f0387a3

hv_netvsc: report vmbus name in ethtool · e6571b7c

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Make netvsc on vmbus behave more like PCI.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e3f74b84)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

e6571b7c

hv_netvsc: make variable local · c5208b6a

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

The variable m_ret is only used in one basic block.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6c4c137e)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

c5208b6a

hv_netvsc: make netvsc_destroy_buf void · 129c8055

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

No caller checks the return value.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7a2a0a84)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

129c8055

hv_netvsc: refactor completion function · f46fb033

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Break the different cases, code is cleaner if broken up
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit bc304dd3)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

f46fb033

hv_netvsc: rearrange start_xmit · 9068290d

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Rearrange the transmit routine to eliminate goto's and unnecessary
boolean variables. Use standard functions to test for vlan tag.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0ab05141)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

9068290d

hv_netvsc: init completion during alloc · a4e86ba1

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Move initialization to allocate where other fields are initialized.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fd612602)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

a4e86ba1

hv_netvsc: make device_remove void · f5027ef6

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Always returns 0 and no callers check.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e08f3ea5)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

f5027ef6

hv_netvsc: use ARRAY_SIZE() for NDIS versions · 54c535cb

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Don't hard code size of array of NDIS versions.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e5a78fad)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

54c535cb

hv_netvsc: make inline functions static · c4fd285d

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Several new functions were introduced into hyperv.h but only used in one file.
Move them and let compiler decide on inline.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 30d1de08)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

c4fd285d

hv_netvsc: style cleanups · a077ec1a

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Fix most of the complaints about the style of the code.
Things like extra blank lines and return statements.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 796cc88c)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

a077ec1a

hv_netvsc: use kcalloc · e5b3e028

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Better to use kcalloc rather than kzalloc and multiply for an array.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e53a9c2a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

e5b3e028

hv_netvsc: make RSS hash key static · fc64501b

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 94773866)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

fc64501b

hv_netvsc: fix rtnl locking in callback · 165f4ed0

Stephen Hemminger authored Aug 23, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

The function get_netvsc_net_device had conditional locking. This was
unnecessary, incorrect, but harmless. It was unnecessary since the
code is only called from netlink netdev event callback where RTNL
is always acquired before the callbacks are run. It was incorrect
because of use of trylock and then continuing.
Fix by replacing with proper assertion.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8737caaf)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

165f4ed0

PCI: hv: Use list_move_tail() instead of list_del() + list_add_tail() · f112b677

Wei Yongjun authored Jul 28, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Use list_move_tail() instead of list_del() + list_add_tail().  No
functional change intended
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit 4f1cb01a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

f112b677

hv_netvsc: Implement batching of receive completions · 88998ef9

Haiyang Zhang authored Aug 19, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

The existing code uses busy retry when unable to send out receive
completions due to full ring buffer. It also gives up retrying after limit
is reached, and causes receive buffer slots not being recycled.
This patch implements batching of receive completions. It also prevents
dropping receive completions due to full ring buffer.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c0b558e5)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

88998ef9

hv_netvsc: Add handler for physical link speed change · e74bdadd

Haiyang Zhang authored Aug 04, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

On Hyper-V host 2016 and later, VMs gets an event message of the physical
link speed when vSwitch is changed. This patch handles this message, so
the updated link speed can be reported by ethtool.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7f5d5af0)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

e74bdadd

hv_netvsc: Add query for initial physical link speed · f63b6d47

Haiyang Zhang authored Aug 04, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

The physical link speed value will be reported by ethtool command.
The real speed is available from Windows 2016 host or later.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b37879e6)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

f63b6d47

memory-hotplug: add automatic onlining policy for the newly added memory · 1b591833

Vitaly Kuznetsov authored Mar 15, 2016

BugLink: http://bugs.launchpad.net/bugs/1650059

Currently, all newly added memory blocks remain in 'offline' state
unless someone onlines them, some linux distributions carry special udev
rules like:

  SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

to make this happen automatically.  This is not a great solution for
virtual machines where memory hotplug is being used to address high
memory pressure situations as such onlining is slow and a userspace
process doing this (udev) has a chance of being killed by the OOM killer
as it will probably require to allocate some memory.

Introduce default policy for the newly added memory blocks in
/sys/devices/system/memory/auto_online_blocks file with two possible
values: "offline" which preserves the current behavior and "online"
which causes all newly added memory blocks to go online as soon as
they're added.  The default is "offline".
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Daniel Kiper <daniel.kiper@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 31bc3858)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

1b591833

ibmveth: calculate gso_segs for large packets · bbce4b9e

Thomas Falcon authored Jan 10, 2017

BugLink: http://bugs.launchpad.net/bugs/1655420

Include calculations to compute the number of segments
that comprise an aggregated large packet.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 94acf164)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

bbce4b9e

ibmveth: set correct gso_size and gso_type · f3e31f52

Thomas Falcon authored Jan 10, 2017

BugLink: http://bugs.launchpad.net/bugs/1655420

This patch is based on an earlier one submitted
by Jon Maxwell with the following commit message:

"We recently encountered a bug where a few customers using ibmveth on the
same LPAR hit an issue where a TCP session hung when large receive was
enabled. Closer analysis revealed that the session was stuck because the
one side was advertising a zero window repeatedly.

We narrowed this down to the fact the ibmveth driver did not set gso_size
which is translated by TCP into the MSS later up the stack. The MSS is
used to calculate the TCP window size and as that was abnormally large,
it was calculating a zero window, even although the sockets receive buffer
was completely empty."

We rely on the Virtual I/O Server partition in a pseries
environment to provide the MSS through the TCP header checksum
field. The stipulation is that users should not disable checksum
offloading if rx packet aggregation is enabled through VIOS.

Some firmware offerings provide the MSS in the RX buffer.
This is signalled by a bit in the RX queue descriptor.
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Reviewed-by: Pradeep Satyanarayana <pradeeps@linux.vnet.ibm.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jonathan Maxwell <jmaxwell37@gmail.com>
Reviewed-by: David Dai <zdai@us.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7b596738)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

f3e31f52