Commits · 5e9965c15ba88319500284e590733f4a4629a288 · Kirill Smelkov / linux

23 Jul, 2012 1 commit

David S. Miller authored Jul 22, 2012

The ipv4 routing cache is non-deterministic, performance wise, and is
subject to reasonably easy to launch denial of service attacks.

The routing cache works great for well behaved traffic, and the world
was a much friendlier place when the tradeoffs that led to the routing
cache's design were considered.

What it boils down to is that the performance of the routing cache is
a product of the traffic patterns seen by a system rather than being a
product of the contents of the routing tables.  The former is
controllable by external entities.

Even for "well behaved" legitimate traffic, high volume sites can see
hit rates in the routing cache of only ~10%.

The general flow of this patch series is that first the routing cache
is removed.  We build a completely new rtable entry every lookup
request.

Next we make some simplifications due to the fact that removing the
routing cache causes several members of struct rtable to become no
longer necessary.

Then we need to make some amends such that we can legally cache
pre-constructed routes in the FIB nexthops.  Firstly, we need to
invalidate routes which are hit with nexthop exceptions.  Secondly we
have to change the semantics of rt->rt_gateway such that zero means
that the destination is on-link and non-zero otherwise.
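
The rt_gateway convention described above can be sketched in user-space C.
This is only an illustration under stated assumptions: struct rtable_sketch
and next_hop() are hypothetical names, and addresses are plain uint32_t
values rather than the kernel's __be32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one struct rtable field discussed above. */
struct rtable_sketch {
	uint32_t rt_gateway;	/* 0 means the destination is on-link */
};

/*
 * Pick the address to resolve at the link layer: the gateway when one
 * is set, otherwise the destination itself.
 */
static inline uint32_t next_hop(const struct rtable_sketch *rt, uint32_t daddr)
{
	assert(rt != NULL);
	return rt->rt_gateway ? rt->rt_gateway : daddr;
}
```

With this convention a cached route does not need to store the destination
when it is on-link, so it can be shared across destinations.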

Now that the preparations are ready, we start caching precomputed
routes in the FIB nexthops.  Output and input routes need different
kinds of care when determining if we can legally do such caching or
not.  The details are in the commit log messages for those changes.

The patch series then winds down with some more struct rtable
simplifications and other tidy ups that remove unnecessary overhead.

On a SPARC-T3 output route lookups are ~876 cycles.  Input route
lookups are ~1169 cycles with rpfilter disabled, and about ~1468
cycles with rpfilter enabled.

These measurements were taken with the kbench_mod test module in the
net_test_tools GIT tree:

git://git.kernel.org/pub/scm/linux/kernel/git/davem/net_test_tools.git

That GIT tree also includes a udpflood tester tool and stresses
route lookups on packet output.

For example, on the same SPARC-T3 system we can run:

	time ./udpflood -l 10000000 10.2.2.11

with routing cache:
real    1m21.955s       user    0m6.530s        sys     1m15.390s

without routing cache:
real    1m31.678s       user    0m6.520s        sys     1m25.140s

Performance undoubtedly can easily be improved further.

For example fib_table_lookup() performs a lot of excessive
computations with all the masking and shifting, some of it
conditionalized to deal with edge cases.

Also, Eric's no-ref optimization for input route lookups can be
re-instated for the FIB nexthop caching code path.  I would be really
pleased if someone would work on that.

In fact anyone suitably motivated can just fire up perf on the loading
of the test net_test_tools benchmark kernel module.  I spend much of
my time going:

bash# perf record insmod ./kbench_mod.ko dst=172.30.42.22 src=74.128.0.1 iif=2
bash# perf report

Thanks to helpful feedback from Joe Perches, Eric Dumazet, Ben
Hutchings, and others.
Signed-off-by: David S. Miller <davem@davemloft.net>

5e9965c1

22 Jul, 2012 22 commits

net: ethernet: davinci_emac: add pm_runtime support · 3ba97381

Mark A. Greer authored Jul 20, 2012

Add pm_runtime support to the TI Davinci EMAC driver.

CC: Sekhar Nori <nsekhar@ti.com>
CC: Kevin Hilman <khilman@ti.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3ba97381

net: ethernet: davinci_emac: Remove unnecessary #include · c6f0b4ea

Mark A. Greer authored Jul 20, 2012

The '#include <mach/mux.h>' line in davinci_emac.c
causes a compile error because that header file
isn't found.  It turns out that the #include isn't
needed because the driver isn't (and shouldn't be)
touching the mux anyway, so remove it.

CC: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c6f0b4ea

net-next: minor cleanups for bonding documentation · f8b72d36

Rick Jones authored Jul 20, 2012

The section titled "Configuring Bonding for Maximum Throughput" is
actually section twelve not thirteen, and there are a couple of words
spelled incorrectly.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Reviewed-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f8b72d36

net: netprio_cgroup: rework update socket logic · 406a3c63

John Fastabend authored Jul 20, 2012

Instead of updating the sk_cgrp_prioidx struct field on every send
this only updates the field when a task is moved via cgroup
infrastructure.

This allows sockets that may be used by a kernel worker thread
to be managed. For example in the iscsi case today a user can
put iscsid in a netprio cgroup and control traffic will be sent
with the correct sk_cgrp_prioidx value set but as soon as data
is sent the kernel worker thread issues a send and sk_cgrp_prioidx
is updated with the kernel worker thread's value which is the
default case.

It seems more correct to only update the field when the user
explicitly sets it via control group infrastructure. This allows
the users to manage sockets that may be used with other threads.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

406a3c63

tun: experimental zero copy tx support · 0690899b

Michael S. Tsirkin authored Jul 20, 2012

Let vhost-net utilize zero copy tx when used with tun.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0690899b

skbuff: export skb_copy_ubufs · dcc0fb78

Michael S. Tsirkin authored Jul 20, 2012

Export skb_copy_ubufs so that modules can orphan frags.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dcc0fb78

net: orphan frags on receive · 1080e512

Michael S. Tsirkin authored Jul 20, 2012

zero copy packets are normally sent to the outside
network, but bridging, tun etc might loop them
back to host networking stack. If this happens
destructors will never be called, so orphan
the frags immediately on receive.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1080e512

tun: orphan frags on xmit · 868eefeb

Michael S. Tsirkin authored Jul 20, 2012

tun xmit is actually receive of the internal tun
socket. Orphan the frags same as we do for normal rx path.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

868eefeb

skbuff: convert to skb_orphan_frags · 70008aa5

Michael S. Tsirkin authored Jul 20, 2012

Reduce code duplication a bit using the new helper.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

70008aa5

skbuff: add an api to orphan frags · a353e0ce

Michael S. Tsirkin authored Jul 20, 2012

Many places do
	if ((skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY))
		skb_copy_ubufs(skb, gfp_mask);
to copy and invoke frag destructors if necessary.
Add an inline helper for this.
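
A user-space sketch of what such a helper might look like follows. It is not
the kernel implementation: struct sk_buff_mock, mock_copy_ubufs() and
orphan_frags_sketch() are hypothetical stand-ins, and the flag value is
chosen only for the mock.

```c
#include <assert.h>
#include <stddef.h>

/* User-space mocks; the real definitions live in the kernel headers. */
#define SKBTX_DEV_ZEROCOPY (1u << 3)	/* bit position chosen for the mock */

struct sk_buff_mock {
	unsigned int tx_flags;	/* stands in for skb_shinfo(skb)->tx_flags */
	int copies;		/* counts calls to the mock copy routine */
};

/* Mock of skb_copy_ubufs(): pretend to copy the user frags, then clear the flag. */
static inline int mock_copy_ubufs(struct sk_buff_mock *skb)
{
	skb->copies++;
	skb->tx_flags &= ~SKBTX_DEV_ZEROCOPY;
	return 0;
}

/* Only skbs carrying zero-copy frags take the copy path. Returns 0 on success. */
static inline int orphan_frags_sketch(struct sk_buff_mock *skb)
{
	assert(skb != NULL);
	if (!(skb->tx_flags & SKBTX_DEV_ZEROCOPY))
		return 0;
	return mock_copy_ubufs(skb);
}
```

Callers then replace the open-coded flag test with a single call and check
the return value.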
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a353e0ce

ixgbe: Fix build with PCI_IOV enabled. · d47e12d6

David S. Miller authored Jul 22, 2012

Signed-off-by: David S. Miller <davem@davemloft.net>

d47e12d6

forcedeth: advertise transmit time stamping · 7491302d

Richard Cochran authored Jul 22, 2012

This driver now offers software transmit time stamping, so it should
advertise that fact via ethtool. Compile tested only.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>

Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7491302d

e1000e: advertise transmit time stamping · 7b1115e0

Richard Cochran authored Jul 22, 2012

This driver now offers software transmit time stamping, so it should
advertise that fact via ethtool. Compile tested only.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>

Cc: Willem de Bruijn <willemb@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Signed-off-by: David S. Miller <davem@davemloft.net>

7b1115e0

e1000: advertise transmit time stamping · e10df2c6

Richard Cochran authored Jul 22, 2012

This driver now offers software transmit time stamping, so it should
advertise that fact via ethtool. Compile tested only.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>

Cc: Willem de Bruijn <willemb@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Signed-off-by: David S. Miller <davem@davemloft.net>

e10df2c6

bnx2x: advertise transmit time stamping · be53ce1e

Richard Cochran authored Jul 22, 2012

This driver now offers software transmit time stamping, so it should
advertise that fact via ethtool. Compile tested only.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be53ce1e

rtnl: Add #ifdef CONFIG_RPS around num_rx_queues reference · 1d69c2b3

Mark A. Greer authored Jul 20, 2012

Commit 76ff5cc9
(rtnl: allow to specify number of rx and tx queues
on device creation) added a reference to the net_device
structure's 'num_rx_queues' member in

	net/core/rtnetlink.c:rtnl_fill_ifinfo()

However, the definition for 'num_rx_queues' is surrounded
by an '#ifdef CONFIG_RPS' while the new reference to it is
not.  This causes a compile error when CONFIG_RPS is not
defined.

Fix the compile error by surrounding the new reference to
'num_rx_queues' by an '#ifdef CONFIG_RPS'.
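
The shape of the fix can be illustrated with a small stand-alone sketch.
The names struct net_device_sketch and total_queues() are hypothetical and
the structure is heavily trimmed. The point is only that the reference is
guarded by the same #ifdef as the definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down device structure. */
struct net_device_sketch {
	unsigned int num_tx_queues;
#ifdef CONFIG_RPS
	unsigned int num_rx_queues;	/* only present when RPS is configured */
#endif
};

/*
 * Sum the queue counts available in this configuration. The reference
 * to num_rx_queues is guarded the same way as its definition, so the
 * code builds with or without -DCONFIG_RPS.
 */
static inline unsigned int total_queues(const struct net_device_sketch *dev)
{
	unsigned int total;

	assert(dev != NULL);
	total = dev->num_tx_queues;
#ifdef CONFIG_RPS
	total += dev->num_rx_queues;
#endif
	return total;
}
```

Removing the inner #ifdef while leaving the one in the structure reproduces
the kind of compile error described above when CONFIG_RPS is undefined.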

CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1d69c2b3

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next · fd183f6a

David S. Miller authored Jul 22, 2012

Jeff Kirsher says:

--------------------
This series contains updates to ixgbe and ixgbevf.
 ...
Akeem G. Abodunrin (1):
  igb: reset PHY in the link_up process to recover PHY setting after
    power down.

Alexander Duyck (8):
  ixgbe: Drop probe_vf and merge functionality into ixgbe_enable_sriov
  ixgbe: Change how we check for pre-existing and assigned VFs
  ixgbevf: Add lock around mailbox ops to prevent simultaneous access
  ixgbevf: Add support for PCI error handling
  ixgbe: Fix handling of FDIR_HASH flag
  ixgbe: Reduce Rx header size to what is actually used
  ixgbe: Use num_tcs.pg_tcs as upper limit for TC when checking based
    on UP
  ixgbe: Use 1TC DCB instead of disabling DCB for MSI and legacy
    interrupts

Don Skidmore (1):
  ixgbe: add support for new 82599 device

Greg Rose (1):
  ixgbevf: Fix namespace issue with ixgbe_write_eitr

John Fastabend (2):
  ixgbe: fix RAR entry counting for generic and fdb_add()
  ixgbe: remove extra unused queues in DCB + FCoE case
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

fd183f6a

Merge branch 'vhost-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 5dc7c779

David S. Miller authored Jul 22, 2012

5dc7c779

wimax: fix printk format warnings · 2a304bb8

Randy Dunlap authored Jul 21, 2012

Fix printk format warnings in drivers/net/wimax/i2400m:

drivers/net/wimax/i2400m/control.c: warning: format '%zu' expects argument of type 'size_t', but argument 4 has type 'ssize_t' [-Wformat]
drivers/net/wimax/i2400m/control.c: warning: format '%zu' expects argument of type 'size_t', but argument 5 has type 'ssize_t' [-Wformat]
drivers/net/wimax/i2400m/usb-fw.c: warning: format '%zu' expects argument of type 'size_t', but argument 4 has type 'ssize_t' [-Wformat]

I don't see these warnings on x86. The warnings that are quoted above
are from Geert's kernel build reports.
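
The mismatch behind these warnings can be shown in user space with
snprintf(), which uses the same size modifiers. This is a sketch, not the
driver code, and format_result() is a hypothetical helper: a signed ssize_t
pairs with "%zd", while "%zu" is for the unsigned size_t.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>	/* ssize_t (POSIX) */

/*
 * Format a signed size into buf. Using "%zu" here instead of "%zd"
 * would draw the -Wformat warning quoted above on targets where the
 * compiler can tell the two types apart.
 */
static inline int format_result(char *buf, size_t len, ssize_t result)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "result %zd", result);
}
```

Negative return codes also print correctly this way, rather than as a very
large unsigned number.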
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: linux-wimax@intel.com
Cc: wimax@linuxwimax.org
Signed-off-by: David S. Miller <davem@davemloft.net>

2a304bb8

sctp: Implement quick failover draft from tsvwg · 5aa93bcf

Neil Horman authored Jul 21, 2012

I've seen several attempts recently made to do quick failover of sctp transports
by reducing various retransmit timers and counters.  While it's possible to
implement a faster failover on multihomed sctp associations, it's not
particularly robust, in that it can lead to unneeded retransmits, as well as
false connection failures due to intermittent latency on a network.

Instead, let's implement the new IETF quick failover draft found here:
http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05

This will let the sctp stack identify transports that have had a small number of
errors, and quickly stop using them until their reliability can be
re-established.  I've tested this out on two virt guests connected via multiple
isolated virt networks and believe it's in compliance with the above draft and
works well.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Sridhar Samudrala <sri@us.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
CC: joe@perches.com
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5aa93bcf

net: fix race condition in several drivers when reading stats · e3906486

Kevin Groeneveld authored Jul 21, 2012

Fix race condition in several network drivers when reading stats on 32bit
UP architectures.  These drivers update their stats in a BH context and
therefore should use u64_stats_fetch_begin_bh/u64_stats_fetch_retry_bh
instead of u64_stats_fetch_begin/u64_stats_fetch_retry when reading the
stats.
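
For readers unfamiliar with the fetch_begin/fetch_retry pairing, the
following user-space model shows the general retry loop. It is a
single-threaded illustration with no memory barriers, and every name in it
is hypothetical. It models a sequence-counter scheme only. On 32-bit UP the
kernel helpers are understood to protect the read by disabling preemption
or bottom halves instead, which is where the _bh variants differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of a sequence-counted 64-bit statistic. */
struct stats_sketch {
	unsigned int seq;	/* odd while an update is in progress */
	uint64_t rx_bytes;
};

static inline void stats_update_begin(struct stats_sketch *s) { s->seq++; }
static inline void stats_update_end(struct stats_sketch *s)   { s->seq++; }

/* Snapshot the sequence number before reading the counters. */
static inline unsigned int stats_fetch_begin(const struct stats_sketch *s)
{
	return s->seq;
}

/* Nonzero means the read overlapped an update and must be repeated. */
static inline int stats_fetch_retry(const struct stats_sketch *s,
				    unsigned int start)
{
	return (start & 1) || s->seq != start;
}

/* Reader side: loop until a snapshot is taken with no update in between. */
static inline uint64_t stats_read_rx_bytes(const struct stats_sketch *s)
{
	unsigned int start;
	uint64_t val;

	assert(s != NULL);
	do {
		start = stats_fetch_begin(s);
		val = s->rx_bytes;
	} while (stats_fetch_retry(s, start));
	return val;
}
```

The reader never blocks the writer. It simply discards any snapshot that
overlapped an update.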
Signed-off-by: Kevin Groeneveld <kgroeneveld@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3906486

ipv4: tcp: set unicast_sock uc_ttl to -1 · 0980e56e

Eric Dumazet authored Jul 20, 2012

Set unicast_sock uc_ttl to -1 so that we select the right ttl,
instead of sending packets with a 0 ttl.
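
The role of -1 can be sketched as follows. This is an illustration, not the
kernel's TTL selection code: select_ttl_sketch() is a hypothetical name and
the default is passed in by the caller.

```c
#include <assert.h>

/*
 * A negative per-socket value means "not set", so fall back to the
 * default. Zero is a real value and would go on the wire as TTL 0.
 */
static inline int select_ttl_sketch(int uc_ttl, int default_ttl)
{
	assert(default_ttl > 0 && default_ttl <= 255);
	return uc_ttl < 0 ? default_ttl : uc_ttl;
}
```

A socket whose uc_ttl is left zero-initialized therefore sends with TTL 0,
whereas -1 defers to the default.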

Bug added in commit be9f4a44 (ipv4: tcp: remove per net tcp_sock)
Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0980e56e

21 Jul, 2012 15 commits

igb: reset PHY in the link_up process to recover PHY setting after power down. · 76886596

Akeem G. Abodunrin authored Jul 17, 2012

There was a previous patch to resolve issue with 82576 losing PHY setting
after PHY power down. However that previous implementation triggered speed
mismatch and occasional link lost. Now, this patch resolves both initial
PHY setting and speed mismatch issues.
Signed-off-by: Akeem G. Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

76886596

ixgbe: Use 1TC DCB instead of disabling DCB for MSI and legacy interrupts · b724e9f2

Alexander Duyck authored Jul 17, 2012

This change makes it so that we can use 1TC DCB in the case of MSI and
legacy interrupts.  The advantage to this is that it allows us to fully
support FCoE w/ DCB instead of having to drop to link flow control only
when using these interrupt modes.

Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b724e9f2

ixgbe: add support for new 82599 device · b6dfd939

Don Skidmore authored Jul 11, 2012

This patch adds support for a new 82599 device that supports WoL.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b6dfd939

ixgbe: remove extra unused queues in DCB + FCoE case · 3f4a6f00

John Fastabend authored Jun 05, 2012

With DCB and FCoE configured extra queues may be allocated and
never used. After this patch we calculate the max correctly.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

3f4a6f00

ixgbe: fix RAR entry counting for generic and fdb_add() · 95447461

John Fastabend authored May 31, 2012

Do RAR entry accounting correctly so that errors are reported and
promisc mode is set correctly when the number of entries exceeds
the hardware limits.

This can happen with many macvlan devices attached to the PF or
by adding many fdb entries in SR-IOV modes.

Also this includes a small refactor to fdb_add() to avoid having so
many nested if/else statements after adding a check for the number
of RAR entries.

The max entries for the PF is currently 16; we allow 15 additional
entries to account for the defined MAC.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

95447461

ixgbe: Use num_tcs.pg_tcs as upper limit for TC when checking based on UP · b92ad72d

Alexander Duyck authored May 25, 2012

This change makes it so the function ixgbe_dcb_get_tc_from_up will use the
num_tcs.pg_tcs to determine the starting value for determining a traffic
class based on a user priority.  The main motivation for this change is to
address possible bad configurations in which more TCs worth of data are
populated than there are actual TCs.  By limiting this value we can at
least make certain we are not providing a map with values that are out of
range.

As a result any user priorities that are setup in the configuration with a
traffic class mapping higher than what the hardware supports will be
reported as being on TC 0.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b92ad72d

ixgbe: Reduce Rx header size to what is actually used · 252562c2

Alexander Duyck authored May 24, 2012

The recent changes to netdev_alloc_skb make it so that the size of
the buffer now has a more direct input on the truesize.  So in
order to make best use of the piece of a page we are allocated I am
reducing the IXGBE_RX_HDR_SIZE to 256 so that our truesize will be reduced
by 256 bytes as well.

This should result in performance improvements since the number of uses per
page should increase from 4 to 6 in the case of a 4K page.  In addition we
should see socket performance improvements due to the truesize dropping
to less than 1K for buffers less than 256 bytes.

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

252562c2

ixgbevf: Fix namespace issue with ixgbe_write_eitr · ce422606

Greg Rose authored May 22, 2012

Make the function static to cleanup namespace.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <Sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ce422606

ixgbe: Fix handling of FDIR_HASH flag · 39cb681b

Alexander Duyck authored Jun 06, 2012

This change makes it so that we can use the atr_sample_rate to determine if
we are capable of supporting ATR. The advantage to this approach is that it
allows us to now determine the setting of the IXGBE_FLAG_FDIR_HASH_CAPABLE
based on the queueing scheme, instead of the queueing scheme being based on
the flag.

Using this approach there are essentially 5 conditions that must be checked
prior to trying to enable ATR:
1.  Is SR-IOV disabled?
2.  Are the number of TCs <= 1?
3.  Is RSS queueing limit greater than 1?
4.  Is atr_sample_rate set?
5.  Is Flow Director perfect filtering disabled?

If any of these checks fails, ATR filtering should be disabled.
Note that in the case of conditions 1 through 4 being met we will set
things up for ATR queueing, however if test 5 fails we will still leave the
queues allocated for use by perfect filters.  The reason for this is to
allow for us to switch back and forth between ntuple and ATR without
needing to reallocate the descriptor rings.
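
The five checks can be written down as two predicates. This is a sketch
under stated assumptions: struct atr_inputs and both function names are
hypothetical and do not mirror the driver's actual fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the adapter state named in the list above. */
struct atr_inputs {
	bool sriov_enabled;
	unsigned int num_tcs;
	unsigned int rss_limit;
	unsigned int atr_sample_rate;	/* 0 means "not set" */
	bool perfect_filters_enabled;
};

/* Conditions 1-4: the queues may be laid out for ATR use. */
static inline bool atr_queueing_ok(const struct atr_inputs *in)
{
	assert(in != NULL);
	return !in->sriov_enabled &&
	       in->num_tcs <= 1 &&
	       in->rss_limit > 1 &&
	       in->atr_sample_rate != 0;
}

/* Condition 5 on top: ATR hashing is used only without perfect filters. */
static inline bool atr_hash_capable(const struct atr_inputs *in)
{
	return atr_queueing_ok(in) && !in->perfect_filters_enabled;
}
```

Keeping the two predicates separate mirrors the note above: the queue layout
depends on conditions 1 through 4 only, so toggling perfect filters does not
change it.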
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

39cb681b

ixgbevf: Add support for PCI error handling · 9f19f31d

Alexander Duyck authored May 11, 2012

This change adds support for handling IO errors and slot resets.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

9f19f31d

ixgbevf: Add lock around mailbox ops to prevent simultaneous access · 1c55ed76

Alexander Duyck authored May 11, 2012

This change adds a spinlock around the mailbox accesses to prevent
simultaneous access to the mailboxes.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

1c55ed76

ixgbe: Change how we check for pre-existing and assigned VFs · 9297127b

Alexander Duyck authored May 23, 2012

This patch does two things. First it drops the unnecessary work of
searching for enabled VFs when we first bring up the adapter and instead
just uses pci_num_vf to determine how many VFs are enabled on the adapter.

The second thing it does is drop the use of vfdev from the vf_data_storage
structure. Instead we just search the entire system for a VF that has us
as its PF, and then if that VF is assigned we indicate that the VFs are
assigned. This allows us to still check for assigned VFs even if the
vfinfo allocation has failed, or vfinfo has been freed.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

9297127b

ixgbe: Drop probe_vf and merge functionality into ixgbe_enable_sriov · 99d74487

Alexander Duyck authored May 09, 2012

This is meant to fix a bug in which we were not checking for pre-existing
VFs if we were not setting the max_vfs value at driver load. What happens
now is that we always call ixgbe_enable_sriov and this checks for
pre-existing VFs or requested VFs prior to deciding on no SR-IOV.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

99d74487

vhost: make vhost work queue visible · 163049ae

Stefan Hajnoczi authored Jul 21, 2012

The vhost work queue allows processing to be done in vhost worker thread
context, which uses the owner process mm.  Access to the vring and guest
memory is typically only possible from vhost worker context so it is
useful to allow work to be queued directly by users.

Currently vhost_net only uses the poll wrappers which do not expose the
work queue functions.  However, for tcm_vhost (vhost_scsi) it will be
necessary to queue custom work.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

163049ae

vhost: Separate vhost-net features from vhost features · 0dd05a3b

Stefan Hajnoczi authored Jul 21, 2012

In order for other vhost devices to use the VHOST_FEATURES bits the
vhost-net specific bits need to be moved to their own VHOST_NET_FEATURES
constant.

(Asias: Update drivers/vhost/test.c to use VHOST_NET_FEATURES)
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Zhi Yong Wu <wuzhy@cn.ibm.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Asias He <asias@redhat.com>
Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

0dd05a3b

20 Jul, 2012 2 commits

forcedeth: spin_unlock_irq in interrupt handler fix · 186e8687

Denis Efremov authored Jul 21, 2012

Replace the spin_lock_irq/spin_unlock_irq pair in the interrupt
handler with a spin_lock_irqsave/spin_unlock_irqrestore pair.

Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Denis Efremov <yefremov.denis@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

186e8687

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch · c073cfc8

David S. Miller authored Jul 20, 2012

Jesse Gross says:

====================
A few bug fixes and small enhancements for net-next/3.6.
 ...
Ansis Atteka (1):
      openvswitch: Do not send notification if ovs_vport_set_options() failed

Ben Pfaff (1):
      openvswitch: Check gso_type for correct sk_buff in queue_gso_packets().

Jesse Gross (2):
      openvswitch: Enable retrieval of TCP flags from IPv6 traffic.
      openvswitch: Reset upper layer protocol info on internal devices.

Leo Alterman (1):
      openvswitch: Fix typo in documentation.

Pravin B Shelar (1):
      openvswitch: Check correct return value from skb_gso_segment()

Raju Subramanian (1):
      openvswitch: Replace Nicira Networks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c073cfc8