Commits · 40d0b514c0e3fe400db305df9caeef89a3445d89 · Kirill Smelkov / linux

16 Sep, 2016 40 commits

hv_netvsc: fix bonding devices check in netvsc_netdev_event() · 40d0b514

Vitaly Kuznetsov authored Aug 15, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Bonding driver sets IFF_BONDING on both master (the bonding device) and
slave (the real NIC) devices and in netvsc_netdev_event() we want to skip
master devices only. Currently, there is an uncertainty when a slave
interface is removed: if bonding module comes first in netdev_chain it
clears IFF_BONDING flag on the netdev and netvsc_netdev_event() correctly
handles NETDEV_UNREGISTER event, but in case netvsc comes first on the
chain it sees the device with IFF_BONDING still attached and skips it. As
we still hold vf_netdev pointer to the device we crash on the next inject.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0dbff144)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
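
The "skip master devices only" rule above can be sketched in user-space C. This is an illustration, not the driver code: the struct, the SKETCH_* flag values and the helper name are hypothetical. It also assumes the master can be told apart from a slave by an additional IFF_MASTER-style flag, which the message above does not spell out.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's IFF_MASTER (flags) and
 * IFF_BONDING (priv_flags); the numeric values are made up. */
#define SKETCH_IFF_MASTER  0x1u
#define SKETCH_IFF_BONDING 0x2u

/* Hypothetical, heavily reduced view of a network device. */
struct sketch_netdev {
	unsigned int flags;
	unsigned int priv_flags;
};

/* True only for the bonding master. A slave carries the bonding flag
 * without the master flag, so its events are still processed. */
static bool sketch_is_bonding_master(const struct sketch_netdev *dev)
{
	return (dev->priv_flags & SKETCH_IFF_BONDING) &&
	       (dev->flags & SKETCH_IFF_MASTER);
}
```

With a check of this shape, the result no longer depends on whether the bonding module has already cleared IFF_BONDING on the slave.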


hv_netvsc: protect module refcount by checking net_device_ctx->vf_netdev · e2f20fb6

Vitaly Kuznetsov authored Aug 15, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

We're not guaranteed to see NETDEV_REGISTER/NETDEV_UNREGISTER notifications
only once per VF but we increase/decrease module refcount unconditionally.
Check vf_netdev to make sure we don't take/release it twice. We presume
that only one VF per netvsc device may exist.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0f20d795)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
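
A minimal user-space sketch of the guard described above, assuming a single VF per device. All names are hypothetical; the integer counter stands in for the module refcount.

```c
#include <stddef.h>

/* Hypothetical reduced context: one VF pointer and a counter that
 * stands in for the module refcount. */
struct sketch_refcnt_ctx {
	void *vf_netdev;   /* NULL while no VF is attached */
	int module_refs;
};

/* Returns 1 if the VF was registered, 0 if the event was a repeat. */
static int sketch_register_vf(struct sketch_refcnt_ctx *ctx, void *vf)
{
	if (ctx->vf_netdev != NULL)
		return 0;          /* already registered: take no second ref */
	ctx->vf_netdev = vf;
	ctx->module_refs++;
	return 1;
}

/* Returns 1 if a VF was released, 0 if there was nothing to release. */
static int sketch_unregister_vf(struct sketch_refcnt_ctx *ctx)
{
	if (ctx->vf_netdev == NULL)
		return 0;          /* repeated UNREGISTER: drop no second ref */
	ctx->vf_netdev = NULL;
	ctx->module_refs--;
	return 1;
}
```

Repeated notifications then become no-ops, so the counter stays balanced.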


hv_netvsc: reset vf_inject on VF removal · 54c9fccb

Vitaly Kuznetsov authored Aug 15, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

We reset vf_inject on VF going down (netvsc_vf_down()) but we don't on
VF removal (netvsc_unregister_vf()) so vf_inject stays 'true' while
vf_netdev is already NULL and we're trying to inject packets into NULL
net device in netvsc_recv_callback() causing kernel to crash.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 57c1826b)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
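
The invariant above (vf_inject must never be true while vf_netdev is NULL) can be sketched as follows. The struct and function names are hypothetical and the code is a user-space illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced context holding only the two fields discussed. */
struct sketch_inject_ctx {
	void *vf_netdev;
	bool vf_inject;
};

/* Removal path: clear the flag before the pointer goes away. */
static void sketch_remove_vf(struct sketch_inject_ctx *ctx)
{
	ctx->vf_inject = false;
	ctx->vf_netdev = NULL;
}

/* Receive path: inject only while a VF device is actually present. */
static bool sketch_should_inject(const struct sketch_inject_ctx *ctx)
{
	return ctx->vf_inject && ctx->vf_netdev != NULL;
}
```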


hv_netvsc: avoid deadlocks between rtnl lock and vf_use_cnt wait · 5424f383

Vitaly Kuznetsov authored Aug 15, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Here is a deadlock scenario:
- netvsc_vf_up() schedules netvsc_notify_peers() work and quits.
- netvsc_vf_down() runs before netvsc_notify_peers() gets executed. As it
  is being executed from netdev notifier chain we hold rtnl lock when we
  get here.
- we enter while (atomic_read(&net_device_ctx->vf_use_cnt) != 0) loop and
  wait till netvsc_notify_peers() drops vf_use_cnt.
- netvsc_notify_peers() starts on some other CPU but netdev_notify_peers()
  will hang on rtnl_lock().
- deadlock!

Instead of introducing additional synchronization I suggest we drop
gwrk.dwrk completely and call NETDEV_NOTIFY_PEERS directly. As we're
acting under rtnl lock this is legitimate.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d072218f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: don't lose VF information · e2d7d2a8

Vitaly Kuznetsov authored Aug 15, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

struct netvsc_device is not suitable for storing VF information as this
structure is being destroyed on MTU change / set channel operation (see
rndis_filter_device_remove()). Move all VF related stuff to struct
net_device_context which is persistent.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f9a7da91)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: Fix VF register on bonding devices · c4bd91b5

Haiyang Zhang authored Jul 22, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Added a condition to avoid bonding devices with same MAC registering
as VF.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e2b9f1f7)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


PCI: hv: Fix interrupt cleanup path · 3f39c1ee

Cathy Avery authored Jul 12, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

SR-IOV disabled from the host causes a memory leak.  pci-hyperv usually
first receives a PCI_EJECT notification and then proceeds to delete the
hpdev list entry in hv_eject_device_work().  Later in hv_msi_free() since
the device is no longer on the device list hpdev is NULL and hv_msi_free
returns without freeing int_desc as part of hv_int_desc_free().
Signed-off-by: Cathy Avery <cavery@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jake Oshins <jakeo@microsoft.com>
(cherry picked from commit 0c6e617f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


x86/kernel: Audit and remove any unnecessary uses of module.h · 1a457308

Paul Gortmaker authored Jul 13, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends.  That changed
when we forked out support for the latter into the export.h file.

This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig.  The advantage
in doing so is that module.h itself sources about 15 other headers;
adding significantly to what we feed cpp, and it can obscure what
headers we are effectively using.

Since module.h was the source for init.h (for __init) and for
export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance
for the presence of either and replace as needed.  Build testing
revealed some implicit header usage that was fixed up accordingly.

Note that some bool/obj-y instances remain since module.h is
the header for some exception table entry stuff, and for things
like __init_or_module (code that is tossed when MODULES=n).
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160714001901.31603-4-paul.gortmaker@windriver.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 186f4360)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


netvsc: Use the new in-place consumption APIs in the rx path · 000563a6

K. Y. Srinivasan authored Jul 05, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Use the new APIs for eliminating a copy on the receive path. These new APIs also
help in minimizing the number of memory barriers we end up issuing (in the
ringbuffer code) since we can better control when we want to expose the ring
state to the host.

The patch is being resent to address earlier email issues.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 99a50bb1)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


PCI: hv: Handle all pending messages in hv_pci_onchannelcallback() · f47e0c02

Vitaly Kuznetsov authored Jun 17, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

When we have an interrupt from the host we have a bit set in event page
indicating there are messages for the particular channel.  We need to read
them all as we won't get signaled for what was on the queue before we
cleared the bit in vmbus_on_event().  This applies to all Hyper-V drivers
and the pass-through driver should do the same.

I did not meet any bugs; the issue was found by code inspection.  We don't
have many events going through hv_pci_onchannelcallback(), which explains
why nobody reported the issue before.

While on it, fix handling non-zero vmbus_recvpacket_raw() return values by
dropping out.  If the return value is not zero, it is wrong to inspect
buffer or bytes_recvd as these may contain invalid data.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jake Oshins <jakeo@microsoft.com>
(cherry picked from commit 837d741e)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
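
The two rules above (drain everything that is queued, and stop without looking at the buffer when the receive call fails) can be sketched like this. The queue type and sketch_recv() are hypothetical stand-ins for the channel and vmbus_recvpacket_raw(); they are not the real API.

```c
#include <stddef.h>

/* Hypothetical packet source: 'pending' packets are queued, and the
 * call fails when 'pending' equals a non-zero 'fail_at'. */
struct sketch_queue {
	int pending;
	int fail_at;
};

/* Returns 0 on success; *len == 0 means the queue is empty. */
static int sketch_recv(struct sketch_queue *q, size_t *len)
{
	if (q->fail_at > 0 && q->pending == q->fail_at)
		return -1;
	if (q->pending == 0) {
		*len = 0;
		return 0;
	}
	q->pending--;
	*len = 1;
	return 0;
}

/* Drains the queue; returns the number of packets handled. */
static int sketch_on_channel_callback(struct sketch_queue *q)
{
	int handled = 0;
	size_t len;

	for (;;) {
		if (sketch_recv(q, &len) != 0)
			break;      /* error: len and buffer may be invalid */
		if (len == 0)
			break;      /* nothing left to read */
		handled++;
	}
	return handled;
}
```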


PCI: hv: Don't leak buffer in hv_pci_onchannelcallback() · caee5e30

Vitaly Kuznetsov authored May 30, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

We don't free buffer on several code paths in hv_pci_onchannelcallback(),
put kfree() to the end of the function to fix the issue.  Direct { kfree();
return; } can now be replaced with a simple 'break';
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jake Oshins <jakeo@microsoft.com>
(cherry picked from commit 60fcdac8)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
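
The single-cleanup-point pattern described above, sketched in user-space C with malloc()/free() standing in for the kernel allocator. The function and its input format are hypothetical.

```c
#include <stdlib.h>

/* Handles packet sizes until the first non-positive one. Every way out
 * of the loop falls through to the one free() at the end. */
static int sketch_process(const int *sizes, int n)
{
	char *buffer = malloc(64);
	int handled = 0;

	if (!buffer)
		return 0;

	for (int i = 0; i < n; i++) {
		if (sizes[i] <= 0)
			break;      /* was: { free(buffer); return; } */
		handled++;
	}

	free(buffer);       /* one cleanup point for all paths */
	return handled;
}
```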


netvsc: get rid of completion timeouts · 550163bc

Vitaly Kuznetsov authored Jun 09, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

I'm hitting 5 second timeout in rndis_filter_set_rss_param() while setting
RSS parameters for the device. When this happens we end up returning
-ETIMEDOUT from the function and rndis_filter_device_add() falls back to
setting

        net_device->max_chn = 1;
        net_device->num_chn = 1;
        net_device->num_sc_offered = 0;

but after a moment the rndis request succeeds and subchannels start to
appear. netvsc_sc_open() does unconditional nvscdev->num_sc_offered-- and
it becomes U32_MAX-1. Consequent rndis_filter_device_remove() will hang
while waiting for all U32_MAX-1 subchannels to appear and this is not
going to happen.

The immediate issue could be solved by adding num_sc_offered > 0 check to
netvsc_sc_open() but we're getting out of sync with the host and it's not
easy to adjust things later, e.g. in this particular case we'll be creating
queues without a user request for it and races are expected. Same applies
to other parts of the driver which have the same completion timeout.

Following the trend in drivers/hv/* code I suggest we remove all these
timeouts completely. As a guest we can always trust the host we're running
on and if the host screws things up there is no easy way to recover anyway.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5362855a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
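
The counter problem above is ordinary unsigned wraparound: decrementing an unsigned 32-bit value that is already 0 yields a huge number instead of a negative one. A small illustration, with hypothetical helper names, of the unguarded decrement next to the "num_sc_offered > 0" style guard that the message mentions as the narrow fix:

```c
#include <stdint.h>

/* Unconditional decrement: 0 wraps around to UINT32_MAX. */
static uint32_t sketch_dec_unchecked(uint32_t n)
{
	return n - 1;
}

/* Guarded decrement: never goes below 0. */
static uint32_t sketch_dec_guarded(uint32_t n)
{
	return n > 0 ? n - 1 : 0;
}
```

The guard only hides the symptom; as argued above, the guest and host would still disagree about how many subchannels exist.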


hv_netvsc: pass struct net_device to rndis_filter_set_offload_params() · 642ddcee

Vitaly Kuznetsov authored Jun 03, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

The only caller rndis_filter_device_add() has 'struct net_device' pointer
already.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 426d9541)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: pass struct net_device to rndis_filter_set_device_mac() · 266e0a2c

Vitaly Kuznetsov authored Jun 03, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

We unpack 'struct net_device' in netvsc_set_mac_addr() to get to
'struct hv_device' pointer which we use in rndis_filter_set_device_mac()
to get back to 'struct net_device'.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e834da9a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: pass struct netvsc_device to rndis_filter_{open, close}() · 998c6801

Vitaly Kuznetsov authored Jun 03, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Both rndis_filter_open()/rndis_filter_close() use struct hv_device to
reach to struct netvsc_device only and all callers have it already.
While on it, rename net_device to nvdev in rndis_filter_open() as
net_device is misleading.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2f5fa6c8)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: introduce {net, hv}_device_to_netvsc_device() helpers · 7117ad54

Vitaly Kuznetsov authored Jun 03, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Make it easier to get 'struct netvsc_device' from 'struct net_device' and
'struct hv_device' by introducing inline helpers.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2625466d)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
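
A sketch of what such helpers could look like. The helper names come from the commit title; the struct definitions are simplified mocks, not the kernel's. They assume that the private context embedded in struct net_device points to struct netvsc_device, and that hv_device driver_data holds the struct net_device.

```c
#include <stddef.h>

/* Simplified mock structures, reduced to the pointers being followed. */
struct netvsc_device { int num_chn; };
struct net_device_context { struct netvsc_device *nvdev; };
struct net_device { struct net_device_context priv; };
struct hv_device { void *driver_data; };

/* net_device -> embedded context -> netvsc_device */
static inline struct netvsc_device *
net_device_to_netvsc_device(struct net_device *ndev)
{
	return ndev->priv.nvdev;
}

/* hv_device -> driver_data (a net_device) -> netvsc_device */
static inline struct netvsc_device *
hv_device_to_netvsc_device(struct hv_device *hdev)
{
	struct net_device *ndev = hdev->driver_data;

	return net_device_to_netvsc_device(ndev);
}
```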


hv_netvsc: remove redundant assignment in netvsc_recv_callback() · 23085756

Vitaly Kuznetsov authored Jun 03, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

net_device_ctx is assigned in the very beginning of the function and 'net'
pointer doesn't change.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4baa994d)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: Fix VF register on vlan devices · 64cf6298

Haiyang Zhang authored Jun 02, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Added a condition to avoid vlan devices with same MAC registering
as VF.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit cb2911fe)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: set nvdev link after populating chn_table · cb34dd79

Vitaly Kuznetsov authored May 13, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Crash in netvsc_send() is observed when netvsc device is re-created on
mtu change/set channels. The crash is caused by dereferencing of NULL
channel pointer which comes from chn_table. The root cause is a mixture
of two facts:
- we set nvdev pointer in net_device_context in alloc_net_device()
  before we populate chn_table.
- we populate chn_table[0] only.

The issue could be papered over by checking channel != NULL in
netvsc_send() but populating the whole chn_table and writing the
nvdev pointer afterwards seems more appropriate.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 88098834)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
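
The ordering described above (fill the whole table first, publish the pointer last) can be sketched as follows. Names and the table size are hypothetical, and the sketch assumes every slot initially points at the primary channel. It is single-threaded; real concurrent code would also need a write barrier before the publishing store.

```c
#include <stddef.h>

#define SKETCH_MAX_CHN 4

/* Hypothetical reduced structures. */
struct sketch_nvdev { void *chn_table[SKETCH_MAX_CHN]; };
struct sketch_link_ctx { struct sketch_nvdev *nvdev; };

static void sketch_attach(struct sketch_link_ctx *ctx,
			  struct sketch_nvdev *nvdev, void *primary)
{
	/* Populate every slot, not just chn_table[0]. */
	for (int i = 0; i < SKETCH_MAX_CHN; i++)
		nvdev->chn_table[i] = primary;

	/* Publish only once the table is complete. */
	ctx->nvdev = nvdev;
}
```

A sender that finds ctx->nvdev set can then index any slot without hitting a NULL channel.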


hv_netvsc: synchronize netvsc_change_mtu()/netvsc_set_channels() with netvsc_remove() · 30b4420d

Vitaly Kuznetsov authored May 13, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

When netvsc device is removed during mtu change or channels setup we get
into trouble as both paths are trying to remove the device. Synchronize
them with start_remove flag and rtnl lock.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 6da7225f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: get rid of struct net_device pointer in struct netvsc_device · 584a270d

Vitaly Kuznetsov authored May 13, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Simplify netvsc pointer graph by getting rid of the redundant ndev
pointer. We can always get a pointer to struct net_device from somewhere
else.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0a1275ca)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: untangle the pointer mess · 449ca62c

Vitaly Kuznetsov authored May 13, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

We have the following structures keeping netvsc adapter state:
- struct net_device
- struct net_device_context
- struct netvsc_device
- struct rndis_device
- struct hv_device
and there are pointers/dependencies between them:
- struct net_device_context is contained in struct net_device
- struct hv_device has driver_data pointer which points to
  'struct net_device' OR 'struct netvsc_device' depending on driver's
  state (!).
- struct net_device_context has a pointer to 'struct hv_device'.
- struct netvsc_device has pointers to 'struct hv_device' and
  'struct net_device_context'.
- struct rndis_device has a pointer to 'struct netvsc_device'.

Different functions get different structures as parameters and use these
pointers for traveling. The problem is (in addition to keeping in mind
this complex graph) that some of these structures (struct netvsc_device
and struct rndis_device) are being removed and re-created on mtu change
(as we implement it as re-creation of hyper-v device) so our travel using
these pointers is dangerous.

Simplify this to the following:
- add struct netvsc_device pointer to struct net_device_context (which is
  a part of struct net_device and thus never disappears)
- remove struct hv_device and struct net_device_context pointers from
  struct netvsc_device
- replace pointer to 'struct netvsc_device' with pointer to
  'struct net_device'.
- always keep 'struct net_device' in hv_device driver_data.

We'll end up with the following 'circular' structure:

net_device:
 [net_device_context] -> netvsc_device -> rndis_device -> net_device
                      -> hv_device -> net_device

On MTU change we'll be removing the 'netvsc_device -> rndis_device'
branch and re-creating it making the synchronization easier.

There is one additional redundant pointer left, it is struct net_device
link in struct netvsc_device, it is going to be removed in a separate
commit.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3d541ac5)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: use start_remove flag to protect netvsc_link_change() · e3d4997b

Vitaly Kuznetsov authored May 13, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

netvsc_link_change() can race with netvsc_change_mtu() or
netvsc_set_channels() as these functions destroy struct netvsc_device and
rndis filter. Use start_remove flag for synchronization. As
netvsc_change_mtu()/netvsc_set_channels() are called with rtnl lock held
we need to take it before checking start_remove value in
netvsc_link_change().
Reported-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1bdcec8a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


hv_netvsc: move start_remove flag to net_device_context · 994b6bd0

Vitaly Kuznetsov authored May 13, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

struct netvsc_device is destroyed on mtu change so keeping the
protection flag there is not a good idea. Move it to struct
net_device_context which is preserved.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f580aec4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


tools: hv: lsvmbus: add pci pass-through UUID · 2e6291b1

Vitaly Kuznetsov authored Apr 30, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

lsvmbus keeps its own copy of all VMBus UUIDs, add PCIe pass-through
device there to not report 'Unknown' for such devices.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 552beb49)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


Drivers: hv: balloon: reset host_specified_ha_region · 5e8453fd

Vitaly Kuznetsov authored Apr 30, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

We set host_specified_ha_region = true on certain request but this is a
global state which stays 'true' forever. We need to reset it when we
receive a request where ha_region is not specified. I did not see any
real issues, the bug was found by code inspection.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d19a55d6)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


Drivers: hv: balloon: don't crash when memory is added in non-sorted order · 65604b05

Vitaly Kuznetsov authored Apr 30, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

When we iterate through all HA regions in handle_pg_range() we have an
assumption that all these regions are sorted in the list and the
'start_pfn >= has->end_pfn' check is enough to find the proper region.
Unfortunately it's not the case with WS2016 where host can hot-add regions
in a different order. We end up modifying the wrong HA region and crashing
later on pages online. Modify the check to make sure we found the region
we were searching for while iterating. Fix the same check in pfn_covered()
as well.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 77c0c973)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
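
Why the one-sided check is not enough once regions arrive unsorted: a pfn below a region's end_pfn may still lie below that region's start_pfn and belong to a later list entry. A user-space sketch of a lookup that tests both bounds; the struct and function names are hypothetical, and an array stands in for the kernel list.

```c
#include <stddef.h>

/* Hypothetical reduced hot-add region: [start_pfn, end_pfn). */
struct sketch_region {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Returns the region containing pfn, or NULL. Works for any ordering. */
static const struct sketch_region *
sketch_find_region(const struct sketch_region *r, size_t n, unsigned long pfn)
{
	for (size_t i = 0; i < n; i++) {
		/* Skip unless pfn is inside this region on BOTH sides. */
		if (pfn < r[i].start_pfn || pfn >= r[i].end_pfn)
			continue;
		return &r[i];
	}
	return NULL;
}
```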


Drivers: hv: vmbus: handle various crash scenarios · 39d44ef9

Vitaly Kuznetsov authored Apr 30, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always
delivered to the CPU which was used for initial contact or to CPU0
depending on host version. vmbus_wait_for_unload() doesn't account for
the fact that in case we're crashing on some other CPU we won't get the
CHANNELMSG_UNLOAD_RESPONSE message and our wait on the current CPU will
never end.

Do the following:
1) Check for completion_done() in the loop. In case interrupt handler is
   still alive we'll get the confirmation we need.

2) Read the message pages of all CPUs as we're unsure where
   CHANNELMSG_UNLOAD_RESPONSE is going to be delivered to. We can race with
   still-alive interrupt handler doing the same, add cmpxchg() to
   vmbus_signal_eom() to not lose CHANNELMSG_UNLOAD_RESPONSE message.

3) Cleanup message pages on all CPUs. This is required (at least for the
   current CPU as we're clearing CPU0 messages now but we may want to bring
   up additional CPUs on crash) as new messages won't be delivered till we
   consume what's pending. On boot we'll place message pages somewhere else
   and we won't be able to read stale messages.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit cd95aad5)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


drivers:hv: Separate out frame buffer logic when picking MMIO range · ad8537af

Jake Oshins authored Apr 05, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Simplify the logic that picks MMIO ranges by pulling out the
logic related to trying to lay frame buffer claim on top of where
the firmware placed the frame buffer.
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ea37a6b8)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


drivers:hv: Track allocations of children of hv_vmbus in private resource tree · 3ad2e40a

Jake Oshins authored Apr 05, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

This patch changes vmbus_allocate_mmio() and vmbus_free_mmio() so
that when child paravirtual devices allocate memory-mapped I/O
space, they allocate it privately from a resource tree pointed
at by hyperv_mmio and also by the public resource tree
iomem_resource.  This allows the region to be marked as "busy"
in the private tree, but a "bridge window" in the public tree,
guaranteeing that no two bridge windows will overlap each other
but while also allowing the PCI device children of the bridge
windows to overlap that window.

One might conclude that this belongs in the pnp layer, rather
than in this driver.  Rafael Wysocki, the maintainer of the
pnp layer, has previously asked that we not modify the pnp layer
as it is considered deprecated.  This patch is thus essentially
a workaround.
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit be000f93)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


drivers:hv: Make a function to free mmio regions through vmbus · 68da7617

Jake Oshins authored Apr 05, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

This patch introduces a function that reverses everything
done by vmbus_allocate_mmio().  Existing code just called
release_mem_region().  Future patches in this series
require a more complex sequence of actions, so this function
is introduced to wrap those actions.
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(back ported from commit 97fb77dc)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

 Conflicts:
	drivers/hv/vmbus_drv.c
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


drivers:hv: Lock access to hyperv_mmio resource tree · 027da541

Jake Oshins authored Apr 05, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

In existing code, this tree of resources is created
in single-threaded code and never modified after it is
created, and thus needs no locking.  This patch introduces
a semaphore for tree access, as other patches in this
series introduce run-time modifications of this resource
tree which can happen on multiple threads.
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(back ported from commit e16dad6b)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

 Conflicts:
	drivers/hv/vmbus_drv.c
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets · 41d65dae

K. Y. Srinivasan authored Apr 02, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

Implement APIs for in-place consumption of vmbus packets. Currently, each
packet is copied and processed one at a time and as part of processing
each packet we potentially may signal the host (if it is waiting for
room to produce a packet).

These APIs help batched in-place processing of vmbus packets.
We also optimize host signaling by having a separate API to signal
the end of in-place consumption. With netvsc using these APIs,
on an iperf run on average I see about 20X reduction in checks to
signal the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ab028db4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h · b9d56f64

K. Y. Srinivasan authored Apr 02, 2016

BugLink: http://bugs.launchpad.net/bugs/1616677

In preparation for implementing APIs for in-place consumption of VMBUS
packets, move some ring buffer functionality into hyperv.h.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(back ported from commit 687f32e6)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

 Conflicts:
	drivers/hv/ring_buffer.c
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>


asm-generic: implement virt_xxx memory barriers · 7c5e0b0f

Michael S. Tsirkin authored Dec 27, 2015

BugLink: http://bugs.launchpad.net/bugs/1616677

Guests running within virtual machines might be affected by SMP effects even if
the guest itself is compiled without SMP support.  This is an artifact of
interfacing with an SMP host while running an UP kernel.  Using mandatory
barriers for this use-case would be possible but is often suboptimal.

In particular, virtio uses a bunch of confusing ifdefs to work around
this, while xen just uses the mandatory barriers.

To better handle this case, low-level virt_mb() etc macros are made available.
These are implemented trivially using the low-level __smp_xxx macros,
the purpose of these wrappers is to annotate those specific cases.

These have the same effect as smp_mb() etc when SMP is enabled, but generate
identical code for SMP and non-SMP systems. For example, virtual machine guests
should use virt_mb() rather than smp_mb() when synchronizing against a
(possibly SMP) host.
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
(cherry picked from commit 6a65d263)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>

7c5e0b0f
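
A rough userspace sketch of the distinction drawn above, assuming a uniprocessor (non-SMP) guest build. The macro bodies are stand-ins built on C11 fences and GCC/Clang inline asm rather than the kernel's real definitions, and `struct shared_slot` and `guest_publish()` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax): emits no instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Stand-in for the architecture's real barrier. */
#define __smp_mb() atomic_thread_fence(memory_order_seq_cst)

/* Assumed UP guest: smp_mb() degrades to a compiler barrier ... */
#define smp_mb() barrier()
/* ... while virt_mb() stays a real barrier regardless of SMP. */
#define virt_mb() __smp_mb()

/* Hypothetical slot in memory shared with a (possibly SMP) host. */
struct shared_slot {
	uint32_t payload;
	uint32_t ready;
};

/* Publish the payload before raising the flag the host polls. */
static void guest_publish(volatile struct shared_slot *slot, uint32_t value)
{
	slot->payload = value;
	virt_mb(); /* smp_mb() here would not order the stores on this UP build */
	slot->ready = 1;
}
```

The point of the sketch is only the choice of macro in `guest_publish()`: the ordering has to hold against the host's CPUs, so it cannot depend on how the guest kernel was configured.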

x86: define __smp_xxx · 6709b4ad

Michael S. Tsirkin authored Dec 27, 2015

BugLink: http://bugs.launchpad.net/bugs/1616677

This defines __smp_xxx barriers for x86,
for use by virtualization.

smp_xxx barriers are removed as they are
defined correctly by asm-generic/barriers.h
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit 1638fb72)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>

6709b4ad

asm-generic: add __smp_xxx wrappers · 630b673a

Michael S. Tsirkin authored Dec 27, 2015

BugLink: http://bugs.launchpad.net/bugs/1616677

On !SMP, most architectures define their
barriers as compiler barriers.
On SMP, most need an actual barrier.

Make it possible to remove the code duplication for
!SMP by defining low-level __smp_xxx barriers
which do not depend on the value of SMP, then
use them from asm-generic conditionally.

Besides reducing code duplication, these low level APIs will also be
useful for virtualization, where a barrier is sometimes needed even if
!SMP since we might be talking to another kernel on the same SMP system.

Both virtio and Xen drivers will benefit.

The smp_xxx variants should use __smp_XXX ones or barrier() depending on
SMP, identically for all architectures.

We keep ifndef guards around them for now - once/if all
architectures are converted to use the generic
code, we'll be able to remove these.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
(cherry picked from commit a9e4252a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>

630b673a
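
The layering described above might look roughly like the following when imitated in userspace. This is a sketch, not the contents of asm-generic/barrier.h: the `__smp_mb()` body is a C11 fence standing in for architecture-specific code, `barrier()` uses GCC/Clang inline asm, and `publish()` is a hypothetical caller. Building with `-DCONFIG_SMP` selects the real barrier; without it, `smp_mb()` becomes a compiler barrier.

```c
#include <assert.h>
#include <stdatomic.h>

/* Compiler-only barrier: blocks compiler reordering, emits no instruction. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Low-level barrier that does not depend on CONFIG_SMP. */
#ifndef __smp_mb
#define __smp_mb() atomic_thread_fence(memory_order_seq_cst)
#endif

/* Generic layer, identical for every architecture. The ifndef guard
 * lets an architecture that has not been converted keep its own copy. */
#ifndef smp_mb
#ifdef CONFIG_SMP
#define smp_mb() __smp_mb()
#else
#define smp_mb() barrier()
#endif
#endif

static int payload;
static int ready;

/* Hypothetical caller: order the data store before the flag store. */
static void publish(int value)
{
	payload = value;
	smp_mb();
	ready = 1;
}
```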

x86: reuse asm-generic/barrier.h · 96d2d8c1

Michael S. Tsirkin authored Dec 21, 2015

BugLink: http://bugs.launchpad.net/bugs/1616677

As on most architectures, on x86 read_barrier_depends and
smp_read_barrier_depends are empty.  Drop the local definitions and pull
the generic ones from asm-generic/barrier.h instead: they are identical.

This is in preparation to refactoring this code area.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit 300b06d4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>

96d2d8c1

asm-generic: guard smp_store_release/load_acquire · bc9ebdb3

Michael S. Tsirkin authored Dec 27, 2015

BugLink: http://bugs.launchpad.net/bugs/1616677

Allow architectures to override smp_store_release
and smp_load_acquire by guarding the defines
in asm-generic/barrier.h with ifndef directives.

This is in preparation to reusing asm-generic/barrier.h
on architectures which have their own definition
of these macros.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
(cherry picked from commit 57f7c037)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>

bc9ebdb3
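
The guard pattern can be shown in miniature as below. Both the "architecture" definition and the generic fallbacks are illustrative stand-ins written with C11 atomics, not the kernel's macros; the sketch only demonstrates that an earlier definition wins and the generic one fills whatever is left.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical architecture header: provides its own store-release. */
#define smp_store_release(p, v) \
	atomic_store_explicit((p), (v), memory_order_release)

/* Generic header: each macro is defined only if still missing. */
#ifndef smp_store_release
#define smp_store_release(p, v)                              \
	do {                                                 \
		atomic_thread_fence(memory_order_release);   \
		*(p) = (v);                                  \
	} while (0)
#endif

#ifndef smp_load_acquire
#define smp_load_acquire(p) \
	atomic_load_explicit((p), memory_order_acquire)
#endif

static _Atomic int flag;
```

Here `smp_store_release(&flag, 5)` expands to the architecture's version, while `smp_load_acquire(&flag)` falls through to the generic one.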

locking/barriers, arch: Use smp barriers in smp_store_release() · 65191e24

Davidlohr Bueso authored Dec 31, 2015

BugLink: http://bugs.launchpad.net/bugs/1616677

With commit b92b8b35 ("locking/arch: Rename set_mb() to smp_store_mb()")
it was made clear that the context of this call (and thus set_mb)
is strictly for CPU ordering, as opposed to IO. As such all archs
should use the smp variant of mb(), respecting the semantics and
saving a mandatory barrier on UP.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <linux-arch@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: dave@stgolabs.net
Link: http://lkml.kernel.org/r/1445975631-17047-3-git-send-email-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
(cherry picked from commit 5a1b26d7)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>

65191e24