Commits · 0b5d404e349c0236b11466c0a4785520c0be6982 · nexedi / linux

02 Sep, 2010 6 commits

pkt_sched: Fix lockdep warning on est_tree_lock in gen_estimator · 0b5d404e

Jarek Poplawski authored Sep 02, 2010

This patch fixes a lockdep warning:

[  516.287584] =========================================================
[  516.288386] [ INFO: possible irq lock inversion dependency detected ]
[  516.288386] 2.6.35b #7
[  516.288386] ---------------------------------------------------------
[  516.288386] swapper/0 just changed the state of lock:
[  516.288386]  (&qdisc_tx_lock){+.-...}, at: [<c12eacda>] est_timer+0x62/0x1b4
[  516.288386] but this lock took another, SOFTIRQ-unsafe lock in the past:
[  516.288386]  (est_tree_lock){+.+...}
[  516.288386] 
[  516.288386] and interrupts could create inverse lock ordering between them.
...

So est_tree_lock needs BH protection because it is taken while
qdisc_tx_lock is held, and qdisc_tx_lock is used in both BH and
process contexts.
(Full warning with this patch at netdev, 02 Sep 2010.)

Fixes commit: ae638c47
("pkt_sched: gen_estimator: add a new lock")
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvs: avoid oops for passive FTP · 7bcbf81a

Julian Anastasov authored Sep 01, 2010

Fix Passive FTP problem in ip_vs_ftp:

- Do not oops in nf_nat_set_seq_adjust (adjust_tcp_sequence) when
  iptable_nat module is not loaded
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "sky2: don't do GRO on second port" · 5e4e7573

David S. Miller authored Sep 02, 2010

This reverts commit de6be6c1.

After some discussion with Jarek Poplawski and Eric Dumazet, we've
decided that this change is incorrect.
Signed-off-by: David S. Miller <davem@davemloft.net>

gro: fix different skb headrooms · 3d3be433

Eric Dumazet authored Sep 01, 2010

Packets entering GRO might have different headrooms, even for a given
flow (because of implementation details in drivers, like copybreak).
We can't force drivers to deliver packets with a fixed headroom.

1) fix skb_segment()

skb_segment() wrongly assumes that the headrooms of fragments are the same
as that of the head. When CHECKSUM_PARTIAL is used, this can give csum_start
errors and crash later in skb_copy_and_csum_dev().

2) allocate a minimal skb for head of frag_list

skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
allocate a fresh skb. This adds NET_SKB_PAD to a padding already
provided by netdevice, depending on various things, like copybreak.

Use alloc_skb() to allocate an exact padding, to reduce cache line
needs:
NET_SKB_PAD + NET_IP_ALIGN

bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626

Many thanks to Plamen Petrov for testing many debugging patches!
With help of Jarek Poplawski.
Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Clear INET control block of SKBs passed into ip_fragment(). · 87f94b4e

David S. Miller authored Sep 01, 2010

In a similar vein to commit 17762060
("bridge: Clear IPCB before possible entry into IP stack")

Any time we call into the IP stack we have to make sure the state
there is as expected by the ipv4 code.

With help from Eric Dumazet and Herbert Xu.
Reported-by: Bandan Das <bandan.das@stratus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3c59x: Remove incorrect locking; correct documented lock hierarchy · 24cd804d

Ben Hutchings authored Aug 30, 2010

vortex_ioctl() was grabbing vortex_private::lock around its call to
generic_mii_ioctl().  This is no longer necessary since there are more
specific locks which the mdio_{read,write}() functions will obtain.
Worse, those functions do not save and restore IRQ flags when locking
the MII state, so interrupts will be enabled when generic_mii_ioctl()
returns.

Since there is currently no need for any function to call
mdio_{read,write}() while holding another spinlock, do not change them
to save and restore IRQ flags but remove the specification of ordering
between vortex_private::lock and vortex_private::mii_lock.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Sep, 2010 9 commits

sky2: don't do GRO on second port · de6be6c1

stephen hemminger authored Aug 30, 2010

There's something very important I forgot to tell you.
 What?

 Don't cross the GRO streams.
 Why?

 It would be bad.
 I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?

 Try to imagine all the Internet as you know it stopping instantaneously
  and every bit in every packet swapping at the speed of light.
 Total packet reordering.
 Right. That's bad. Okay. All right. Important safety tip. Thanks, Hubert

The simplest way to stop this is just avoid doing GRO on the second port.
Very few Marvell boards support two ports per ring, and GRO is just
an optimization.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: minor fix about RPF in help of Kconfig · 750e9fad

Nicolas Dichtel authored Aug 31, 2010

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm_user: avoid a warning with some compiler · 928497f0

Nicolas Dichtel authored Aug 31, 2010

Attached is a small patch to remove a warning ("warning: ISO C90 forbids
mixed declarations and code" with gcc 4.3.2).
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched/sch_hfsc.c: initialize parent's cl_cfmin properly in init_vf() · 3b2eb613

Michal Soltys authored Aug 30, 2010

This patch fixes init_vf() function, so on each new backlog period parent's
cl_cfmin is properly updated (including further propagation towards the root),
even if the activated leaf has no upperlimit curve defined.
Signed-off-by: Michal Soltys <soltys@ziu.info>
Signed-off-by: David S. Miller <davem@davemloft.net>

pxa168_eth: fix a mdiobus leak · 9c01ae58

Denis Kirjanov authored Aug 29, 2010

mdiobus resources must be released on exit
Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Acked-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net sched: fix kernel leak in act_police · 0f04cfd0

Jeff Mahoney authored Aug 31, 2010

While reviewing commit 1c40be12, I
 audited other users of tc_action_ops->dump for information leaks.

 That commit covered almost all of them but act_police still had a leak.

 opt.limit and opt.capab aren't zeroed out before the structure is
 passed out.

 This patch uses the C99 initializers to zero everything unused out.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
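
As a rough userspace illustration of the C99 initializer approach described in this commit message: members that are not named in a designated initializer are zero-initialized, so stale stack contents cannot escape through them. The struct, its fields and `make_opt()` are hypothetical stand-ins, not the real act_police structure. The C standard gives no such guarantee for padding bytes, so this sketch only covers named members.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a structure that is dumped to user space. */
struct police_opt {
	uint32_t index;
	uint32_t limit;   /* not filled in by the dump path */
	uint32_t mtu;
	uint32_t capab;   /* not filled in by the dump path */
};

/* Build the structure with a C99 designated initializer: every member
 * that is not named here is implicitly initialized to zero. */
static struct police_opt make_opt(uint32_t index, uint32_t mtu)
{
	struct police_opt opt = {
		.index = index,
		.mtu   = mtu,
	};
	return opt;
}
```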

vhost: stop worker only if created · 78b620ce

Eric Dumazet authored Aug 31, 2010

It's currently illegal to call kthread_stop(NULL).
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

MAINTAINERS: Add ehea driver as Supported · aa8a9e25

Breno Leitao authored Sep 01, 2010

This change just adds the IBM eHEA 10Gb network drivers as supported.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 · a3f86ec0
David S. Miller authored Sep 01, 2010

31 Aug, 2010 5 commits

ath9k_hw: fix parsing of HT40 5 GHz CTLs · 90487974

Luis R. Rodriguez authored Aug 30, 2010

The 5 GHz CTL indexes were not being read for all hardware
devices due to the masking out through the CTL_MODE_M mask
being one bit too short. Without this the calibrated regulatory
maximum values were not being picked up when devices operate
on 5 GHz in HT40 mode. The final output power used for Atheros
devices is the minimum between the calibrated CTL values and
what CRDA provides.

Cc: stable@kernel.org [2.6.27+]
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
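
A minimal sketch of how a mask that is one bit too short loses an index, as this commit message describes. The mask values and the index 8 are illustrative assumptions, not the driver's actual definitions, and `ctl_mode()` is a hypothetical helper.

```c
#include <assert.h>

/* Illustrative values only; the real definitions live in the driver. */
#define MODE_MASK_SHORT 0x7   /* three bits: indexes 0..7 */
#define MODE_MASK_FULL  0xf   /* four bits: indexes 0..15 */
#define MODE_5G_HT40    8     /* hypothetical index that needs bit 3 */

/* Extract the mode index from a table entry using the given mask. */
static unsigned int ctl_mode(unsigned int entry, unsigned int mask)
{
	return entry & mask;
}
```

With the short mask, an entry carrying index 8 is read back as index 0, so the matching table row is never found.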

ath9k_hw: Fix EEPROM uncompress block reading on AR9003 · 803288e6

Luis R. Rodriguez authored Aug 30, 2010

The EEPROM is compressed on AR9003, upon decompression
the wrong upper limit was being used for the block which
prevented the 5 GHz CTL indexes from being used, which are
stored towards the end of the EEPROM block. This fix allows
the actual intended regulatory limits to be used on AR9003
hardware.

Cc: stable@kernel.org [2.6.36+]
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

wireless: register wiphy rfkill w/o holding cfg80211_mutex · c3d34d5d

John W. Linville authored Aug 30, 2010

Otherwise lockdep complains...

https://bugzilla.kernel.org/show_bug.cgi?id=17311

[ INFO: possible circular locking dependency detected ]
2.6.36-rc2-git4 #12
-------------------------------------------------------
kworker/0:3/3630 is trying to acquire lock:
 (rtnl_mutex){+.+.+.}, at: [<ffffffff813396c7>] rtnl_lock+0x12/0x14

but task is already holding lock:
 (rfkill_global_mutex){+.+.+.}, at: [<ffffffffa014b129>]
rfkill_switch_all+0x24/0x49 [rfkill]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (rfkill_global_mutex){+.+.+.}:
       [<ffffffff81079ad7>] lock_acquire+0x120/0x15b
       [<ffffffff813ae869>] __mutex_lock_common+0x54/0x52e
       [<ffffffff813aede9>] mutex_lock_nested+0x34/0x39
       [<ffffffffa014b4ab>] rfkill_register+0x2b/0x29c [rfkill]
       [<ffffffffa0185ba0>] wiphy_register+0x1ae/0x270 [cfg80211]
       [<ffffffffa0206f01>] ieee80211_register_hw+0x1b4/0x3cf [mac80211]
       [<ffffffffa0292e98>] iwl_ucode_callback+0x9e9/0xae3 [iwlagn]
       [<ffffffff812d3e9d>] request_firmware_work_func+0x54/0x6f
       [<ffffffff81065d15>] kthread+0x8c/0x94
       [<ffffffff8100ac24>] kernel_thread_helper+0x4/0x10

-> #1 (cfg80211_mutex){+.+.+.}:
       [<ffffffff81079ad7>] lock_acquire+0x120/0x15b
       [<ffffffff813ae869>] __mutex_lock_common+0x54/0x52e
       [<ffffffff813aede9>] mutex_lock_nested+0x34/0x39
       [<ffffffffa018605e>] cfg80211_get_dev_from_ifindex+0x1b/0x7c [cfg80211]
       [<ffffffffa0189f36>] cfg80211_wext_giwscan+0x58/0x990 [cfg80211]
       [<ffffffff8139a3ce>] ioctl_standard_iw_point+0x1a8/0x272
       [<ffffffff8139a529>] ioctl_standard_call+0x91/0xa7
       [<ffffffff8139a687>] T.723+0xbd/0x12c
       [<ffffffff8139a727>] wext_handle_ioctl+0x31/0x6d
       [<ffffffff8133014e>] dev_ioctl+0x63d/0x67a
       [<ffffffff8131afd9>] sock_ioctl+0x48/0x21d
       [<ffffffff81102abd>] do_vfs_ioctl+0x4ba/0x509
       [<ffffffff81102b5d>] sys_ioctl+0x51/0x74
       [<ffffffff81009e02>] system_call_fastpath+0x16/0x1b

-> #0 (rtnl_mutex){+.+.+.}:
       [<ffffffff810796b0>] __lock_acquire+0xa93/0xd9a
       [<ffffffff81079ad7>] lock_acquire+0x120/0x15b
       [<ffffffff813ae869>] __mutex_lock_common+0x54/0x52e
       [<ffffffff813aede9>] mutex_lock_nested+0x34/0x39
       [<ffffffff813396c7>] rtnl_lock+0x12/0x14
       [<ffffffffa0185cb5>] cfg80211_rfkill_set_block+0x1a/0x7b [cfg80211]
       [<ffffffffa014aed0>] rfkill_set_block+0x80/0xd5 [rfkill]
       [<ffffffffa014b07e>] __rfkill_switch_all+0x3f/0x6f [rfkill]
       [<ffffffffa014b13d>] rfkill_switch_all+0x38/0x49 [rfkill]
       [<ffffffffa014b821>] rfkill_op_handler+0x105/0x136 [rfkill]
       [<ffffffff81060708>] process_one_work+0x248/0x403
       [<ffffffff81062620>] worker_thread+0x139/0x214
       [<ffffffff81065d15>] kthread+0x8c/0x94
       [<ffffffff8100ac24>] kernel_thread_helper+0x4/0x10
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>

netlink: Make NETLINK_USERSOCK work again. · b963ea89

David S. Miller authored Aug 30, 2010

Once we started enforcing that a nl_table[] entry exists for
a protocol, NETLINK_USERSOCK stopped working.  Add a dummy
table entry so that it works again.
Reported-by: Thomas Voegtle <tv@lio96.de>
Tested-by: Thomas Voegtle <tv@lio96.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

irda: Correctly clean up self->ias_obj on irda_bind() failure. · 628e300c

David S. Miller authored Aug 30, 2010

If irda_open_tsap() fails, the irda_bind() code tries to destroy
the ->ias_obj object by hand, but does so wrongly.

In particular, it fails to a) release the hashbin attached to the
object and b) reset the self->ias_obj pointer to NULL.

Fix both problems by using irias_delete_object() and explicitly
setting self->ias_obj to NULL, just as irda_release() does.
Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Aug, 2010 5 commits

wireless extensions: fix kernel heap content leak · 42da2f94

Johannes Berg authored Aug 30, 2010

Wireless extensions have an unfortunate, undocumented
requirement which requires drivers to always fill
iwp->length when returning a successful status. When
a driver doesn't do this, it leads to a kernel heap
content leak when userspace offers a larger buffer
than would have been necessary.

Arguably, this is a driver bug, as it should, if it
returns 0, fill iwp->length, even if it separately
indicated that the buffer contents were not valid.

However, we can also at least avoid the memory content
leak if the driver doesn't do this by setting the iwp
length to max_tokens, which then reflects how big the
buffer is that the driver may fill, regardless of how
big the userspace buffer is.

To illustrate the point, this patch also fixes a
corresponding cfg80211 bug (since this requirement
isn't documented nor was ever pointed out by anyone
during code review, I don't trust all drivers nor
all cfg80211 handlers to implement it correctly).

Cc: stable@kernel.org [all the way back]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

MAINTAINERS: change broken url for prism54 · 9ef80804

John W. Linville authored Aug 27, 2010

Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

mac80211: delete work timer · 071249b1

Johannes Berg authored Aug 25, 2010

The new workqueue changes helped me find this bug
that's been lingering since the changes to the work
processing in mac80211 -- the work timer is never
deleted properly. Do that to avoid having it fire
after all data structures have been freed. It can't
be re-armed because all it will do, if running, is
schedule the work, but that gets flushed later and
won't have anything to do since all work items are
gone by now (by way of interface removal).

Cc: stable@kernel.org [2.6.34+]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

p54: fix tx feedback status flag check · f880c205

Christian Lamparter authored Aug 24, 2010

Michael reported that p54* never really entered power
save mode, even though it was enabled.

It turned out that upon a power save mode change the
firmware will set a special flag onto the last outgoing
frame tx status (which in this case is almost always the
designated PSM nullfunc frame). This flag confused the
driver; it erroneously reported transmission failures
to the stack, which then generated the next nullfunc,
and so on...

Cc: <stable@kernel.org>
Reported-by: Michael Buesch <mb@bu3sch.de>
Tested-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

ath5k: check return value of ieee80211_get_tx_rate · d8e1ba76

John W. Linville authored Aug 24, 2010

This avoids a NULL pointer dereference as reported here:

	https://bugzilla.redhat.com/show_bug.cgi?id=625889

When the WARN condition is hit in ieee80211_get_tx_rate, it will return
NULL.  So, we need to check the return value and avoid dereferencing it
in that case.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: stable@kernel.org
Acked-by: Bob Copeland <me@bobcopeland.com>

28 Aug, 2010 2 commits

pcnet_cs: add new_id · 7619b1b2

Ken Kawasaki authored Aug 28, 2010

pcnet_cs:
    add new_id: "KENTRONICS KEP-230" 10Base-T PCMCIA card.
Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/ipv4: Eliminate kstrdup memory leak · c34186ed

Julia Lawall authored Aug 27, 2010

The string clone is only used as a temporary copy of the argument val
within the while loop, and so it should be freed before leaving the
function.  The call to strsep, however, modifies clone, so a pointer to the
front of the string is kept in saved_clone, to make it possible to free it.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
local idexpression x;
expression E;
identifier l;
statement S;
@@

*x= \(kasprintf\|kstrdup\)(...);
...
if (x == NULL) S
... when != kfree(x)
    when != E = x
if (...) {
  <... when != kfree(x)
* goto l;
  ...>
* return ...;
}
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
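
The saved-pointer pattern described in this commit message can be sketched in userspace C, with strdup() and free() standing in for kstrdup() and kfree(). `count_tokens()` is a hypothetical helper written for this illustration, not code from the patch.

```c
#define _DEFAULT_SOURCE   /* for strsep() and strdup() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Count the space-separated tokens in val without leaking the copy. */
static int count_tokens(const char *val)
{
	char *clone, *saved_clone, *tok;
	int n = 0;

	assert(val != NULL);

	/* Keep a second pointer to the start of the allocation. */
	saved_clone = clone = strdup(val);
	if (!clone)
		return -1;

	/* strsep() advances clone and finally sets it to NULL ... */
	while ((tok = strsep(&clone, " ")) != NULL) {
		if (*tok)
			n++;
	}

	/* ... so free the pointer that was saved before the loop. */
	free(saved_clone);
	return n;
}
```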

27 Aug, 2010 1 commit

libertas: if_sdio: fix buffer alignment in struct if_sdio_card · 557de5eb

Mike Rapoport authored Aug 22, 2010

The commit 886275ce (param: lock
if_sdio's lbs_helper_name and lbs_fw_name against sysfs changes)
introduced new fields into the if_sdio_card structure. It caused
misalignment of the if_sdio_card.buffer field and failure at driver
load time:

  ~# modprobe libertas_sdio
  [   62.315124] libertas_sdio: Libertas SDIO driver
  [   62.319976] libertas_sdio: Copyright Pierre Ossman
  [   63.020629] DMA misaligned error with device 48
  [   63.025207] mmci-omap-hs mmci-omap-hs.1: unexpected dma status 800
  [   66.005035] libertas: command 0x0003 timed out
  [   66.009826] libertas: Timeout submitting command 0x0003
  [   66.016296] libertas: PREP_CMD: command 0x0003 failed: -110

Adding explicit alignment attribute for the if_sdio_card.buffer field
fixes this problem.
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Acked-by: Marek Vasut <marek.vasut@gmail.com>
Acked-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
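
A small sketch of the effect of an explicit alignment attribute on a struct member, as this commit message describes. Both structs are hypothetical, and the 4-byte requirement is an assumption made for the example. The attribute syntax is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: an odd-sized field pushes the byte buffer to an
 * unaligned offset. */
struct card_unaligned {
	char    name[5];
	uint8_t buffer[64];
};

/* Same layout with an explicit alignment attribute on the buffer;
 * the compiler inserts padding in front of it. */
struct card_aligned {
	char    name[5];
	uint8_t buffer[64] __attribute__((aligned(4)));
};
```

Checking `offsetof()` for both structs shows the difference: the first buffer starts right after the 5-byte field, while the second starts on a 4-byte boundary.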

26 Aug, 2010 6 commits

net/caif/cfrfml.c: use asm/unaligned.h · 7e368739

Jeff Mahoney authored Aug 26, 2010

caif does not build on ia64 starting with 2.6.32-rc1. Using
asm/unaligned.h instead of linux/unaligned/le_byteshift.h fixes the issue.

include/linux/unaligned/le_byteshift.h:40:50: error: redefinition of 'get_unaligned_le16'
include/linux/unaligned/le_byteshift.h:45:50: error: redefinition of 'get_unaligned_le32'
include/linux/unaligned/le_byteshift.h:50:50: error: redefinition of 'get_unaligned_le64'
include/linux/unaligned/le_byteshift.h:55:51: error: redefinition of 'put_unaligned_le16'
include/linux/unaligned/le_byteshift.h:60:51: error: redefinition of 'put_unaligned_le32'
include/linux/unaligned/le_byteshift.h:65:51: error: redefinition of 'put_unaligned_le64'
include/linux/unaligned/le_struct.h:31:51: note: previous definition of 'put_unaligned_le64' was here
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ax25: missplaced sock_put(sk) · d71b0e9c

Bernard Pidoux F6BVP authored Aug 26, 2010

This patch moves a misplaced sock_put(sk) after
bh_unlock_sock(sk),
as in other parts of the AX25 driver.
Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlge: reset the chip before freeing the buffers · fe5f0980

Breno Leitao authored Aug 26, 2010

Qlge is freeing the buffers before stopping the card DMA, and
this can cause a severe error, such as an EEH event on PPC.

This patch just stops the card and then frees the resources.
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

l2tp: test for ethernet header in l2tp_eth_dev_recv() · bfc960a8

Eric Dumazet authored Aug 25, 2010

close https://bugzilla.kernel.org/show_bug.cgi?id=16529

Before calling dev_forward_skb(), we should make sure skb head contains
at least an ethernet header, even if length included in upper layer said
so. Use pskb_may_pull() to make sure this ethernet header is present in
skb head.
Reported-by: Thomas Heil <heil@terminal-consulting.de>
Reported-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
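
The length check described in this commit message can be modeled in userspace as follows. This is only a sketch: `eth_type()` is a hypothetical helper, and the real pskb_may_pull() can also move data from fragments into the skb head, which is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN 14   /* 6 dst + 6 src + 2 ethertype */

/* Return the ethertype, or -1 if the frame is too short to hold an
 * Ethernet header. The check uses the bytes actually present, not a
 * length claimed by an outer protocol layer. */
static int eth_type(const uint8_t *frame, size_t present)
{
	if (present < ETH_HLEN)
		return -1;
	return (frame[12] << 8) | frame[13];
}
```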

tcp: select(writefds) don't hang up when a peer close connection · d84ba638

KOSAKI Motohiro authored Aug 24, 2010

This issue comes from the Ruby language community. The test program
below hangs only when run on Linux.

	% uname -mrsv
	Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686
	% ruby -rsocket -ve '
	BasicSocket.do_not_reverse_lookup = true
	serv = TCPServer.open("127.0.0.1", 0)
	s1 = TCPSocket.open("127.0.0.1", serv.addr[1])
	s2 = serv.accept
	s2.close
	s1.write("a") rescue p $!
	s1.write("a") rescue p $!
	Thread.new {
	  s1.write("a")
	}.join'
	ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux]
	#<Errno::EPIPE: Broken pipe>
	[Hang Here]

FreeBSD, Solaris and Mac don't hang. The hang occurs because Ruby's write()
method calls select() internally, and tcp_poll() has a bug.

SUS defines 'ready for writing' for select() as follows.

|  A descriptor shall be considered ready for writing when a call to an output
|  function with O_NONBLOCK clear would not block, whether or not the function
|  would transfer data successfully.

That is, the EPIPE situation clearly counts as 'ready for writing'.

We don't have a read-side issue because tcp_poll() already handles
read-side shutdown.

|        if (sk->sk_shutdown & RCV_SHUTDOWN)
|                mask |= POLLIN | POLLRDNORM | POLLRDHUP;

So let's insert the same logic on the write side.

- reference url
  http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31065
  http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31068
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
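
A simplified model of the readiness rule described in this commit message: once the send side is shut down, a write fails immediately instead of blocking, so the socket should be reported as writable. The flag values, event bits and `poll_mask()` are assumptions made for this sketch and do not reproduce the real tcp_poll().

```c
#include <assert.h>

/* Hypothetical flag values for this sketch; the kernel defines its own. */
#define RCV_SHUTDOWN   0x1
#define SEND_SHUTDOWN  0x2

#define EV_IN   0x01   /* readable */
#define EV_OUT  0x04   /* writable */

/* Compute a readiness mask from the shutdown state and buffer status. */
static unsigned int poll_mask(unsigned int shutdown, int has_data, int has_space)
{
	unsigned int mask = 0;

	/* Read side: a read after shutdown returns at once. */
	if (has_data || (shutdown & RCV_SHUTDOWN))
		mask |= EV_IN;

	/* Same rule on the write side: a write after shutdown fails with
	 * EPIPE instead of blocking, so the socket counts as writable. */
	if (has_space || (shutdown & SEND_SHUTDOWN))
		mask |= EV_OUT;

	return mask;
}
```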

tcp: fix three tcp sysctls tuning · c5ed63d6

Eric Dumazet authored Aug 25, 2010

As discovered by Anton Blanchard, current code to autotune 
tcp_death_row.sysctl_max_tw_buckets, sysctl_tcp_max_orphans and
sysctl_max_syn_backlog makes little sense.

The bigger a page is, the smaller tcp_max_orphans is: 4096 on a 512GB
machine in Anton's case.

(tcp_hashinfo.bhash_size * sizeof(struct inet_bind_hashbucket))
is much bigger if spinlock debugging is on. It's wrong to select bigger
limits in this case (where kernel structures are also bigger).

bhash_size max is 65536, and we get this value even for small machines. 

A better basis is the size of the ehash table; this also makes the code
shorter and more obvious.

Based on a patch from Anton, and another from David.
Reported-and-tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Aug, 2010 1 commit

tcp: Combat per-cpu skew in orphan tests. · ad1af0fe

David S. Miller authored Aug 25, 2010

As reported by Anton Blanchard, when we use
percpu_counter_read_positive() to make our orphan socket limit checks,
the check can be off by up to num_online_cpus() * batch (which is 32
by default), which on a 128 cpu machine can be as large as the default
orphan limit itself.

Fix this by doing the full expensive sum check if the optimized check
triggers.
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
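
The two-stage check described in this commit message can be modeled in single-threaded userspace C. This is a sketch under stated assumptions: the struct, the fixed NR_CPUS of 4 and all function names are hypothetical, and the real percpu_counter API is not reproduced.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4     /* hypothetical CPU count for the sketch */
#define BATCH   32    /* per-CPU batch size, as quoted in the message */

struct pcpu_counter {
	long count;            /* global, lazily updated value */
	long local[NR_CPUS];   /* per-CPU deltas not yet folded in */
};

/* Add to one CPU's delta; fold it into the global count at +/-BATCH. */
static void counter_add(struct pcpu_counter *c, int cpu, long n)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->local[cpu] += n;
	if (c->local[cpu] >= BATCH || c->local[cpu] <= -BATCH) {
		c->count += c->local[cpu];
		c->local[cpu] = 0;
	}
}

/* Cheap read: ignores the per-CPU deltas, so it can be far off. */
static long counter_read_positive(const struct pcpu_counter *c)
{
	return c->count > 0 ? c->count : 0;
}

/* Expensive read: walks every CPU's delta. */
static long counter_sum_positive(const struct pcpu_counter *c)
{
	long sum = c->count;

	for (int i = 0; i < NR_CPUS; i++)
		sum += c->local[i];
	return sum > 0 ? sum : 0;
}

/* Limit check: only pay for the full sum when the cheap read trips. */
static bool too_many(const struct pcpu_counter *c, long limit)
{
	if (counter_read_positive(c) > limit)
		return counter_sum_positive(c) > limit;
	return false;
}
```

The full sum runs only on the rare path where the cheap read already exceeds the limit, so the common case stays inexpensive.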

24 Aug, 2010 5 commits

pxa168_eth: silence gcc warnings · b2bc8563

Dan Carpenter authored Aug 24, 2010

Casting "pep->tx_desc_dma" to a struct tx_desc pointer makes gcc
complain:

drivers/net/pxa168_eth.c:657: warning:
	cast to pointer from integer of different size
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pxa168_eth: update call to phy_mii_ioctl() · 4f2c8510

Dan Carpenter authored Aug 24, 2010

The phy_mii_ioctl() function changed recently.  It now takes a struct
ifreq pointer directly.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

pxa168_eth: fix error handling in prope · 945c7c73

Dan Carpenter authored Aug 24, 2010

A couple issues here:
* Some resources weren't released.
* If alloc_etherdev() failed it would have caused a NULL dereference
  because "pep" would be null when we checked "if (pep->clk)".
* Also it's better to propagate the error codes from mdiobus_register()
  instead of just returning -ENOMEM.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
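
The three points in this commit message (release what was acquired, never dereference a failed allocation, propagate the callee's error code) correspond to the usual goto-unwind pattern. The sketch below is a userspace illustration with hypothetical names; `fail_at` is a test knob that selects which step fails, and nothing here is taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct dev {
	int *clk;   /* stands in for a resource acquired during probe */
};

/* Hypothetical probe: fail_at selects which step fails (0 = none). */
static int probe(struct dev **out, int fail_at)
{
	struct dev *d;
	int err;

	d = (fail_at == 1) ? NULL : calloc(1, sizeof(*d));
	if (!d)
		return -ENOMEM;   /* nothing to unwind; d is never touched */

	d->clk = (fail_at == 2) ? NULL : malloc(sizeof(*d->clk));
	if (!d->clk) {
		err = -ENOMEM;
		goto err_free_dev;
	}

	err = (fail_at == 3) ? -EIO : 0;   /* stands in for a register call */
	if (err)
		goto err_free_clk;         /* propagate the callee's code */

	*out = d;
	return 0;

err_free_clk:
	free(d->clk);
err_free_dev:
	free(d);
	return err;
}
```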

pxa168_eth: remove unneeded null check · 4169591f

Dan Carpenter authored Aug 24, 2010

"pep->pd" isn't checked consistently in this function.  For example it's
dereferenced unconditionally on the next line after the end of the if
condition.  This function is only called from pxa168_eth_probe() and
pep->pd is always non-NULL so I removed the check.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

phylib: Fix race between returning phydev and calling adjust_link · ef24b16b

Anton Vorontsov authored Aug 24, 2010

It is possible that phylib will call adjust_link before returning
from {,of_}phy_connect(), which may cause the following [very rare,
though] oops upon reopening the device:

  Unable to handle kernel paging request for data at address 0x0000024c
  Oops: Kernel access of bad area, sig: 11 [#1]
  PREEMPT SMP NR_CPUS=2 LTT NESTING LEVEL : 0
  P1021 RDB
  Modules linked in:
  NIP: c0345dac LR: c0345dac CTR: c0345d84
  TASK = dffab6b0[30] 'events/0' THREAD: c0d24000 CPU: 0
  [...]
  NIP [c0345dac] adjust_link+0x28/0x19c
  LR [c0345dac] adjust_link+0x28/0x19c
  Call Trace:
  [c0d25f00] [000045e1] 0x45e1 (unreliable)
  [c0d25f30] [c036c158] phy_state_machine+0x3ac/0x554
  [...]

Here is why. Drivers store phydev in their private structures, e.g.
gianfar driver:

static int init_phy(struct net_device *dev)
{
	...
	priv->phydev = of_phy_connect(...);
	...
}

So that adjust_link could retrieve it back:

static void adjust_link(struct net_device *dev)
{
	...
	struct phy_device *phydev = priv->phydev;
	...
}

If the device has been opened before, then phydev->state is set to
PHY_HALTED (or undefined if the driver didn't call phy_stop()).

Now, phy_connect starts the PHY state machine before returning phydev to
the driver:

	phy_start_machine(phydev, NULL);

	if (phydev->irq > 0)
		phy_start_interrupts(phydev);

	return phydev;

The time between 'phy_start_machine()' and 'return phydev' is undefined.
The start machine routine delays execution for 1 second, which is enough
for most cases. But under heavy load, or if you're unlucky, it is quite
possible that PHY state machine will execute before phy_connect()
returns, and so adjust_link callback will try to dereference phydev,
which is not yet ready.

To fix the issue, simply initialize the PHY's state to PHY_READY during
phy_attach(). This will ensure that phylib won't call adjust_link before
phy_start().
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
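
The idea behind the fix can be sketched with a heavily simplified state machine. Everything below is a hypothetical model: the three states, the struct and the function names are assumptions for illustration, and the real phylib state machine has more states and different transition rules.

```c
#include <assert.h>
#include <stddef.h>

enum state { ST_READY, ST_UP, ST_HALTED };   /* simplified subset */

struct phy {
	enum state state;
	void (*adjust_link)(struct phy *);
	int callbacks;      /* counts adjust_link invocations */
};

/* Example callback that only records that it was called. */
static void count_cb(struct phy *p)
{
	p->callbacks++;
}

/* One tick of the simplified machine: READY is idle, the other
 * states may notify the driver. */
static void machine_tick(struct phy *p)
{
	switch (p->state) {
	case ST_READY:
		break;
	case ST_UP:
	case ST_HALTED:
		if (p->adjust_link)
			p->adjust_link(p);
		break;
	}
}

/* Attach resets the state, so a stale HALTED left over from a previous
 * open cannot trigger a callback before the driver calls start. */
static void attach(struct phy *p)
{
	p->state = ST_READY;
}

static void start(struct phy *p)
{
	p->state = ST_UP;
}
```

In this model a tick that runs between attach and start does nothing, which is the window the commit message is concerned with.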
