1. 08 Oct, 2013 16 commits
  2. 07 Oct, 2013 8 commits
    • Eric W. Biederman's avatar
      net: Update the sysctl permissions handler to test effective uid/gid · 88ba09df
      Eric W. Biederman authored
      On Tue, 20 Aug 2013 11:40:04 -0500 Eric Sandeen <sandeen@redhat.com> wrote:
      > This was brought up in a Red Hat bug (which may be marked private, I'm sorry):
      >
      > Bug 987055 - open O_WRONLY succeeds on some root owned files in /proc for process running with unprivileged EUID
      >
      > "On RHEL7 some of the files in /proc can be opened for writing by an unprivileged EUID."
      >
      > The flaw existed upstream as well last I checked.
      >
      > This commit in kernel v3.8 caused the regression:
      >
      > commit cff10976
      > Author: Eric W. Biederman <ebiederm@xmission.com>
      > Date:   Fri Nov 16 03:03:01 2012 +0000
      >
      >     net: Update the per network namespace sysctls to be available to the network namespace owner
      >
      >     - Allow anyone with CAP_NET_ADMIN rights in the user namespace of the
      >       the netowrk namespace to change sysctls.
      >     - Allow anyone the uid of the user namespace root the same
      >       permissions over the network namespace sysctls as the global root.
      >     - Allow anyone with gid of the user namespace root group the same
      >       permissions over the network namespace sysctl as the global root group.
      >
      >     Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      >     Signed-off-by: David S. Miller <davem@davemloft.net>
      >
      > because it changed /sys/net's special permission handler to test current_uid, not
      > current_euid; same for current_gid/current_egid.
      >
      > So in this case, root cannot drop privs via set[ug]id, and retains all privs
      > in this codepath.
      
      Modify the code to use current_euid(), and in_egroup_p, as in done
      in fs/proc/proc_sysctl.c:test_perm()
      
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarEric Sandeen <sandeen@redhat.com>
      Reported-by: default avatarEric Sandeen <sandeen@redhat.com>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      88ba09df
    • Marc Kleine-Budde's avatar
      can: dev: fix nlmsg size calculation in can_get_size() · fe119a05
      Marc Kleine-Budde authored
      This patch fixes the calculation of the nlmsg size, by adding the missing
      nla_total_size().
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fe119a05
    • Jiri Benc's avatar
      ipv4: fix ineffective source address selection · 0a7e2260
      Jiri Benc authored
      When sending out multicast messages, the source address in inet->mc_addr is
      ignored and rewritten by an autoselected one. This is caused by a typo in
      commit 813b3b5d ("ipv4: Use caller's on-stack flowi as-is in output
      route lookups").
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0a7e2260
    • Markus Pargmann's avatar
    • Markus Pargmann's avatar
      net: ethernet: cpsw: Search childs for slave nodes · f468b10e
      Markus Pargmann authored
      The current implementation searches the whole DT for nodes named
      "slave".
      
      This patch changes it to search only child nodes for slaves.
      Signed-off-by: default avatarMarkus Pargmann <mpa@pengutronix.de>
      Acked-by: default avatarMugunthan V N <mugunthanvnm@ti.com>
      Acked-by: default avatarPeter Korsgaard <jacmet@sunsite.dk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f468b10e
    • Alexei Starovoitov's avatar
      net: fix unsafe set_memory_rw from softirq · d45ed4a4
      Alexei Starovoitov authored
      on x86 system with net.core.bpf_jit_enable = 1
      
      sudo tcpdump -i eth1 'tcp port 22'
      
      causes the warning:
      [   56.766097]  Possible unsafe locking scenario:
      [   56.766097]
      [   56.780146]        CPU0
      [   56.786807]        ----
      [   56.793188]   lock(&(&vb->lock)->rlock);
      [   56.799593]   <Interrupt>
      [   56.805889]     lock(&(&vb->lock)->rlock);
      [   56.812266]
      [   56.812266]  *** DEADLOCK ***
      [   56.812266]
      [   56.830670] 1 lock held by ksoftirqd/1/13:
      [   56.836838]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380
      [   56.849757]
      [   56.849757] stack backtrace:
      [   56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45
      [   56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
      [   56.882004]  ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007
      [   56.895630]  ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001
      [   56.909313]  ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001
      [   56.923006] Call Trace:
      [   56.929532]  [<ffffffff8175a145>] dump_stack+0x55/0x76
      [   56.936067]  [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208
      [   56.942445]  [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50
      [   56.948932]  [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150
      [   56.955470]  [<ffffffff810ccb52>] mark_lock+0x282/0x2c0
      [   56.961945]  [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50
      [   56.968474]  [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50
      [   56.975140]  [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90
      [   56.981942]  [<ffffffff810cef72>] lock_acquire+0x92/0x1d0
      [   56.988745]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
      [   56.995619]  [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50
      [   57.002493]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
      [   57.009447]  [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380
      [   57.016477]  [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380
      [   57.023607]  [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460
      [   57.030818]  [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10
      [   57.037896]  [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0
      [   57.044789]  [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0
      [   57.051720]  [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40
      [   57.058727]  [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40
      [   57.065577]  [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30
      [   57.072338]  [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0
      [   57.078962]  [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0
      [   57.085373]  [<ffffffff81058245>] run_ksoftirqd+0x35/0x70
      
      cannot reuse jited filter memory, since it's readonly,
      so use original bpf insns memory to hold work_struct
      
      defer kfree of sk_filter until jit completed freeing
      
      tested on x86_64 and i386
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d45ed4a4
    • Oussama Ghorbel's avatar
      ipv6: Allow the MTU of ipip6 tunnel to be set below 1280 · 582442d6
      Oussama Ghorbel authored
      The (inner) MTU of a ipip6 (IPv4-in-IPv6) tunnel cannot be set below 1280, which is the minimum MTU in IPv6.
      However, there should be no IPv6 on the tunnel interface at all, so the IPv6 rules should not apply.
      More info at https://bugzilla.kernel.org/show_bug.cgi?id=15530
      
      This patch allows to check the minimum MTU for ipv6 tunnel according to these rules:
      -In case the tunnel is configured with ipip6 mode the minimum MTU is 68.
      -In case the tunnel is configured with ip6ip6 or any mode the minimum MTU is 1280.
      Signed-off-by: default avatarOussama Ghorbel <ou.ghorbel@gmail.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      582442d6
    • Michael S. Tsirkin's avatar
      netif_set_xps_queue: make cpu mask const · 3573540c
      Michael S. Tsirkin authored
      virtio wants to pass in cpumask_of(cpu), make parameter
      const to avoid build warnings.
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3573540c
  3. 04 Oct, 2013 1 commit
    • Eric Dumazet's avatar
      tcp: do not forget FIN in tcp_shifted_skb() · 5e8a402f
      Eric Dumazet authored
      Yuchung found following problem :
      
       There are bugs in the SACK processing code, merging part in
       tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
       skbs FIN flag. When a receiver first SACK the FIN sequence, and later
       throw away ofo queue (e.g., sack-reneging), the sender will stop
       retransmitting the FIN flag, and hangs forever.
      
      Following packetdrill test can be used to reproduce the bug.
      
      $ cat sack-merge-bug.pkt
      `sysctl -q net.ipv4.tcp_fack=0`
      
      // Establish a connection and send 10 MSS.
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +.000 bind(3, ..., ...) = 0
      +.000 listen(3, 1) = 0
      
      +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
      +.001 < . 1:1(0) ack 1 win 1024
      +.000 accept(3, ..., ...) = 4
      
      +.100 write(4, ..., 12000) = 12000
      +.000 shutdown(4, SHUT_WR) = 0
      +.000 > . 1:10001(10000) ack 1
      +.050 < . 1:1(0) ack 2001 win 257
      +.000 > FP. 10001:12001(2000) ack 1
      +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop>
      +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop>
      // SACK reneg
      +.050 < . 1:1(0) ack 12001 win 257
      +0 %{ print "unacked: ",tcpi_unacked }%
      +5 %{ print "" }%
      
      First, a typo inverted left/right of one OR operation, then
      code forgot to advance end_seq if the merged skb carried FIN.
      
      Bug was added in 2.6.29 by commit 832d11c5
      ("tcp: Try to restore large SKBs while SACK processing")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Acked-by: default avatarIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5e8a402f
  4. 03 Oct, 2013 4 commits
  5. 02 Oct, 2013 11 commits
    • David S. Miller's avatar
      Merge branch 'mv643xx' · 569943d0
      David S. Miller authored
      Sebastian Hesselbarth says:
      
      ====================
      This patch set comprises some one-liners to fix issues with repeated
      loading and unloading of a modular mv643xx_eth driver.
      
      First two patches take care of the periodic port statistic timer, that
      updates statistics by reading port registers using add_timer/mod_timer.
      
      Patch 1 moves timer re-schedule from mib_counters_update to the timer
      callback. As mib_counters_update is also called from non-timer context,
      this ensures the timer is reactivated from timer context only.
      
      Patch 2 moves initial timer schedule from _probe() time to right before
      the port is actually started as the corresponding del_timer_sync is at
      _stop() time. This fixes a regression, where unloading the driver from a
      non-started eth device can cause the timer to access deallocated mem.
      
      Patch 3 adds an assignment of the ports device_node to the corresponding
      self-created platform_device. This is required to allow fixups based on
      the device_node's compatible string later. Actually, it is also a potential
      regression because we already check compatible string for Kirkwood, but
      does not (yet) rely on the fixup.
      
      All patches are based on v3.12-rc3 and have been tested on Kirkwood-based
      Seagate Dockstar.
      
      Patches 1 and 2 can also possibly queued up for -stable.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      569943d0
    • Sebastian Hesselbarth's avatar
      net: mv643xx_eth: fix missing device_node for port devices · b5d82db8
      Sebastian Hesselbarth authored
      DT-based mv643xx_eth probes and creates platform_devices for the
      port devices on its own. To allow fixups for ports based on the
      device_node, we need to set .of_node of the corresponding device
      with the correct node.
      Signed-off-by: default avatarSebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Acked-by: default avatarJason Cooper <jason@lakedaemon.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b5d82db8
    • Sebastian Hesselbarth's avatar
      net: mv643xx_eth: fix orphaned statistics timer crash · f564412c
      Sebastian Hesselbarth authored
      The periodic statistics timer gets started at port _probe() time, but
      is stopped on _stop() only. In a modular environment, this can cause
      the timer to access already deallocated memory, if the module is unloaded
      without starting the eth device. To fix this, we add the timer right
      before the port is started, instead of at _probe() time.
      Signed-off-by: default avatarSebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Acked-by: default avatarJason Cooper <jason@lakedaemon.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f564412c
    • Sebastian Hesselbarth's avatar
      net: mv643xx_eth: update statistics timer from timer context only · 041b4ddb
      Sebastian Hesselbarth authored
      Each port driver installs a periodic timer to update port statistics
      by calling mib_counters_update. As mib_counters_update is also called
      from non-timer context, we should not reschedule the timer there but
      rather move it to timer-only context.
      Signed-off-by: default avatarSebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
      Acked-by: default avatarJason Cooper <jason@lakedaemon.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      041b4ddb
    • François Cachereul's avatar
      l2tp: fix kernel panic when using IPv4-mapped IPv6 addresses · e18503f4
      François Cachereul authored
      IPv4 mapped addresses cause kernel panic.
      The patch juste check whether the IPv6 address is an IPv4 mapped
      address. If so, use IPv4 API instead of IPv6.
      
      [  940.026915] general protection fault: 0000 [#1]
      [  940.026915] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core pppox ppp_generic slhc loop psmouse
      [  940.026915] CPU: 0 PID: 3184 Comm: memcheck-amd64- Not tainted 3.11.0+ #1
      [  940.026915] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  940.026915] task: ffff880007130e20 ti: ffff88000737e000 task.ti: ffff88000737e000
      [  940.026915] RIP: 0010:[<ffffffff81333780>]  [<ffffffff81333780>] ip6_xmit+0x276/0x326
      [  940.026915] RSP: 0018:ffff88000737fd28  EFLAGS: 00010286
      [  940.026915] RAX: c748521a75ceff48 RBX: ffff880000c30800 RCX: 0000000000000000
      [  940.026915] RDX: ffff88000075cc4e RSI: 0000000000000028 RDI: ffff8800060e5a40
      [  940.026915] RBP: ffff8800060e5a40 R08: 0000000000000000 R09: ffff88000075cc90
      [  940.026915] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88000737fda0
      [  940.026915] R13: 0000000000000000 R14: 0000000000002000 R15: ffff880005d3b580
      [  940.026915] FS:  00007f163dc5e800(0000) GS:ffffffff81623000(0000) knlGS:0000000000000000
      [  940.026915] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  940.026915] CR2: 00000004032dc940 CR3: 0000000005c25000 CR4: 00000000000006f0
      [  940.026915] Stack:
      [  940.026915]  ffff88000075cc4e ffffffff81694e90 ffff880000c30b38 0000000000000020
      [  940.026915]  11000000523c4bac ffff88000737fdb4 0000000000000000 ffff880000c30800
      [  940.026915]  ffff880005d3b580 ffff880000c30b38 ffff8800060e5a40 0000000000000020
      [  940.026915] Call Trace:
      [  940.026915]  [<ffffffff81356cc3>] ? inet6_csk_xmit+0xa4/0xc4
      [  940.026915]  [<ffffffffa0038535>] ? l2tp_xmit_skb+0x503/0x55a [l2tp_core]
      [  940.026915]  [<ffffffff812b8d3b>] ? pskb_expand_head+0x161/0x214
      [  940.026915]  [<ffffffffa003e91d>] ? pppol2tp_xmit+0xf2/0x143 [l2tp_ppp]
      [  940.026915]  [<ffffffffa00292e0>] ? ppp_channel_push+0x36/0x8b [ppp_generic]
      [  940.026915]  [<ffffffffa00293fe>] ? ppp_write+0xaf/0xc5 [ppp_generic]
      [  940.026915]  [<ffffffff8110ead4>] ? vfs_write+0xa2/0x106
      [  940.026915]  [<ffffffff8110edd6>] ? SyS_write+0x56/0x8a
      [  940.026915]  [<ffffffff81378ac0>] ? system_call_fastpath+0x16/0x1b
      [  940.026915] Code: 00 49 8b 8f d8 00 00 00 66 83 7c 11 02 00 74 60 49
      8b 47 58 48 83 e0 fe 48 8b 80 18 01 00 00 48 85 c0 74 13 48 8b 80 78 02
      00 00 <48> ff 40 28 41 8b 57 68 48 01 50 30 48 8b 54 24 08 49 c7 c1 51
      [  940.026915] RIP  [<ffffffff81333780>] ip6_xmit+0x276/0x326
      [  940.026915]  RSP <ffff88000737fd28>
      [  940.057945] ---[ end trace be8aba9a61c8b7f3 ]---
      [  940.058583] Kernel panic - not syncing: Fatal exception in interrupt
      Signed-off-by: default avatarFrançois CACHEREUL <f.cachereul@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e18503f4
    • Eric Dumazet's avatar
      net: do not call sock_put() on TIMEWAIT sockets · 80ad1d61
      Eric Dumazet authored
      commit 3ab5aee7 ("net: Convert TCP & DCCP hash tables to use RCU /
      hlist_nulls") incorrectly used sock_put() on TIMEWAIT sockets.
      
      We should instead use inet_twsk_put()
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      80ad1d61
    • Andy Gospodarek's avatar
      bonding: update MAINTAINERS · 28ad7b06
      Andy Gospodarek authored
      Veaceslav has been doing a significant amount of work on bonding lately and
      reached out to me about being a maintainer.  After discussing this with him, I
      think he would be a good fit as a bonding maintainer.
      Signed-off-by: default avatarAndy Gospodarek <andy@greyhouse.net>
      Acked-by: default avatarVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: default avatarJay Vosburgh <fubar@us.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      28ad7b06
    • stephen hemminger's avatar
      tc: export tc_defact.h to userspace · 5bc3db5c
      stephen hemminger authored
      Jamal sent patch to add tc user simple actions to iproute2
      but required header was not being exported.
      Signed-off-by: default avatarStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5bc3db5c
    • Andi Kleen's avatar
      tcp: Always set options to 0 before calling tcp_established_options · 5843ef42
      Andi Kleen authored
      tcp_established_options assumes opts->options is 0 before calling,
      as it read modify writes it.
      
      For the tcp_current_mss() case the opts structure is not zeroed,
      so this can be done with uninitialized values.
      
      This is ok, because ->options is not read in this path.
      But it's still better to avoid the operation on the uninitialized
      field. This shuts up a static code analyzer, and presumably
      may help the optimizer.
      
      Cc: netdev@vger.kernel.org
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5843ef42
    • Andi Kleen's avatar
      igb: Avoid uninitialized advertised variable in eee_set_cur · 58e4e1f6
      Andi Kleen authored
      eee_get_cur assumes that the output data is already zeroed. It can
      read-modify-write the advertised field:
      
                    if (ipcnfg & E1000_IPCNFG_EEE_100M_AN)
      2594			edata->advertised |= ADVERTISED_100baseT_Full;
      
      This is ok for the normal ethtool eee_get call, which always
      zeroes the input data before.
      
      But eee_set_cur also calls eee_get_cur and it did not zero the input
      field. Later on it then compares agsinst the field, which can contain partial
      stack garbage.
      
      Zero the input field in eee_set_cur() too.
      
      Cc: jeffrey.t.kirsher@intel.com
      Cc: netdev@vger.kernel.org
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Acked-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      58e4e1f6
    • David S. Miller's avatar
      Merge branch 'calxedaxgmac' · 52f77ba9
      David S. Miller authored
      Rob Herring says:
      
      ====================
      This is a couple of fixes related to xgmac_set_rx_mode. The changes are
      necessary for "bridge fdb add" to work correctly.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      52f77ba9