1. 23 Mar, 2011 20 commits
  2. 22 Mar, 2011 19 commits
    • David S. Miller's avatar
    • Julian Anastasov's avatar
      ipv4: optimize route adding on secondary promotion · 04024b93
      Julian Anastasov authored
      Optimize the calling of fib_add_ifaddr for all
      secondary addresses after the promoted one to start from
      their place, not from the new place of the promoted
      secondary. It will save some CPU cycles because we
      are sure the promoted secondary was first for the subnet
      and all next secondaries do not change their place.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      04024b93
    • Julian Anastasov's avatar
      ipv4: remove the routes on secondary promotion · 2d230e2b
      Julian Anastasov authored
      The secondary address promotion relies on fib_sync_down_addr
      to remove all routes created for the secondary addresses when
      the old primary address is deleted. It does not happen for cases
      when the primary address is also in another subnet. Fix that
      by deleting local and broadcast routes for all secondaries while
      they are on device list and by faking that all addresses from
      this subnet are to be deleted. It relies on fib_del_ifaddr being
      able to ignore the IPs from the concerned subnet while checking
      for duplication.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2d230e2b
    • Julian Anastasov's avatar
      ipv4: fix route deletion for IPs on many subnets · e6abbaa2
      Julian Anastasov authored
      Alex Sidorenko reported for problems with local
      routes left after IP addresses are deleted. It happens
      when same IPs are used in more than one subnet for the
      device.
      
      	Fix fib_del_ifaddr to restrict the checks for duplicate
      local and broadcast addresses only to the IFAs that use
      our primary IFA or another primary IFA with same address.
      And we expect the prefsrc to be matched when the routes
      are deleted because it is possible they to differ only by
      prefsrc. This patch prevents local and broadcast routes
      to be leaked until their primary IP is deleted finally
      from the box.
      
      	As the secondary address promotion needs to delete
      the routes for all secondaries that used the old primary IFA,
      add option to ignore these secondaries from the checks and
      to assume they are already deleted, so that we can safely
      delete the route while these IFAs are still on the device list.
      Reported-by: default avatarAlex Sidorenko <alexandre.sidorenko@hp.com>
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e6abbaa2
    • Julian Anastasov's avatar
      ipv4: match prefsrc when deleting routes · 74cb3c10
      Julian Anastasov authored
      fib_table_delete forgets to match the routes by prefsrc.
      Callers can specify known IP in fc_prefsrc and we should remove
      the exact route. This is needed for cases when same local or
      broadcast addresses are used in different subnets and the
      routes differ only in prefsrc. All callers that do not provide
      fc_prefsrc will ignore the route prefsrc as before and will
      delete the first occurence. That is how the ip route del default
      magic works.
      
      	Current callers are:
      
      - ip_rt_ioctl where rtentry_to_fib_config provides fc_prefsrc only
      when the provided device name matches IP label with colon.
      
      - inet_rtm_delroute where RTA_PREFSRC is optional too
      
      - fib_magic which deals with routes when deleting addresses
      and where the fc_prefsrc is always set with the primary IP
      for the concerned IFA.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      74cb3c10
    • Marc Zyngier's avatar
      NET: smsc95xx: don't use stack for async writes to the device · 3c0f3c60
      Marc Zyngier authored
      The set_multicast operation performs asynchronous writes to the
      device, with some addresses pointing to the stack. Bad things may
      happen, and this is trapped CONFIG_DMA_API_DEBUG:
      
      [    5.237762] WARNING: at /build/buildd/linux-linaro-omap-2.6.38/lib/dma-debug.c:867 check_for_stack+0xd4/0x100()
      [    5.237792] ehci-omap ehci-omap.0: DMA-API: device driver maps memory fromstack [addr=d9c77dec]
      [    5.237792] Modules linked in: smsc95xx(+) usbnet twl6030_usb twl4030_pwrbutton leds_gpio omap_wdt omap2_mcspi
      [    5.237854] [<c006d618>] (unwind_backtrace+0x0/0xf8) from [<c00a6a14>] (warn_slowpath_common+0x54/0x64)
      [    5.237884] [<c00a6a14>] (warn_slowpath_common+0x54/0x64) from [<c00a6ab8>] (warn_slowpath_fmt+0x30/0x40)
      [    5.237915] [<c00a6ab8>] (warn_slowpath_fmt+0x30/0x40) from [<c034e9d8>] (check_for_stack+0xd4/0x100)
      [    5.237915] [<c034e9d8>] (check_for_stack+0xd4/0x100) from [<c034fea8>] (debug_dma_map_page+0xb4/0xdc)
      [    5.237976] [<c034fea8>] (debug_dma_map_page+0xb4/0xdc) from [<c04242f0>] (map_urb_for_dma+0x26c/0x304)
      [    5.237976] [<c04242f0>] (map_urb_for_dma+0x26c/0x304) from [<c0424594>] (usb_hcd_submit_urb+0x78/0x19c)
      [    5.238037] [<c0424594>] (usb_hcd_submit_urb+0x78/0x19c) from [<bf049c5c>] (smsc95xx_write_reg_async+0xb4/0x130 [smsc95xx])
      [    5.238067] [<bf049c5c>] (smsc95xx_write_reg_async+0xb4/0x130 [smsc95xx]) from [<bf049dd4>] (smsc95xx_set_multicast+0xfc/0x148 [smsc95xx])
      [    5.238098] [<bf049dd4>] (smsc95xx_set_multicast+0xfc/0x148 [smsc95xx]) from [<bf04a118>] (smsc95xx_reset+0x2f8/0x68c [smsc95xx])
      [    5.238128] [<bf04a118>] (smsc95xx_reset+0x2f8/0x68c [smsc95xx]) from [<bf04a8cc>] (smsc95xx_bind+0xcc/0x188 [smsc95xx])
      [    5.238159] [<bf04a8cc>] (smsc95xx_bind+0xcc/0x188 [smsc95xx]) from [<bf03ef1c>] (usbnet_probe+0x204/0x4c4 [usbnet])
      [    5.238220] [<bf03ef1c>] (usbnet_probe+0x204/0x4c4 [usbnet]) from [<c0429078>] (usb_probe_interface+0xe4/0x1c4)
      [    5.238250] [<c0429078>] (usb_probe_interface+0xe4/0x1c4) from [<c03a8770>] (really_probe+0x64/0x160)
      [    5.238250] [<c03a8770>] (really_probe+0x64/0x160) from [<c03a8a30>] (driver_probe_device+0x48/0x60)
      [    5.238281] [<c03a8a30>] (driver_probe_device+0x48/0x60) from [<c03a8ad4>] (__driver_attach+0x8c/0x90)
      [    5.238311] [<c03a8ad4>] (__driver_attach+0x8c/0x90) from [<c03a7b24>] (bus_for_each_dev+0x50/0x7c)
      [    5.238311] [<c03a7b24>] (bus_for_each_dev+0x50/0x7c) from [<c03a82ec>] (bus_add_driver+0x190/0x250)
      [    5.238311] [<c03a82ec>] (bus_add_driver+0x190/0x250) from [<c03a8cf8>] (driver_register+0x78/0x13c)
      [    5.238433] [<c03a8cf8>] (driver_register+0x78/0x13c) from [<c0428040>] (usb_register_driver+0x78/0x13c)
      [    5.238464] [<c0428040>] (usb_register_driver+0x78/0x13c) from [<c005b680>] (do_one_initcall+0x34/0x188)
      [    5.238494] [<c005b680>] (do_one_initcall+0x34/0x188) from [<c00e11f0>] (sys_init_module+0xb0/0x1c0)
      [    5.238525] [<c00e11f0>] (sys_init_module+0xb0/0x1c0) from [<c0065c40>] (ret_fast_syscall+0x0/0x30)
      
      Move the two offenders to the private structure which is kmalloc-ed,
      and thus safe.
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Cc: Steve Glendinning <steve.glendinning@smsc.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3c0f3c60
    • Michał Mirosław's avatar
      net: implement dev_disable_lro() hw_features compatibility · 27660515
      Michał Mirosław authored
      Implement compatibility with new hw_features for dev_disable_lro().
      This is a transition path - dev_disable_lro() should be later
      integrated into netdev_fix_features() after all drivers are converted.
      Signed-off-by: default avatarMichał Mirosław <mirq-linux@rere.qmqm.pl>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      27660515
    • Simon Horman's avatar
      IPVS: Use global mutex in ip_vs_app.c · 736561a0
      Simon Horman authored
      As part of the work to make IPVS network namespace aware
      __ip_vs_app_mutex was replaced by a per-namespace lock,
      ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes.
      
      Unfortunately this implementation results in ipvs->app_key residing
      in non-static storage which at the very least causes a lockdep warning.
      
      This patch takes the rather heavy-handed approach of reinstating
      __ip_vs_app_mutex which will cover access to the ipvs->list_head
      of all network namespaces.
      
      [   12.610000] IPVS: Creating netns size=2456 id=0
      [   12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
      [   12.640000] BUG: key ffff880003bbf1a0 not in .data!
      [   12.640000] ------------[ cut here ]------------
      [   12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570()
      [   12.640000] Hardware name: Bochs
      [   12.640000] Pid: 1, comm: swapper Tainted: G        W 2.6.38-kexec-06330-g69b7efe-dirty #122
      [   12.650000] Call Trace:
      [   12.650000]  [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
      [   12.650000]  [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
      [   12.650000]  [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
      [   12.650000]  [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
      [   12.650000]  [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
      [   12.650000]  [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
      [   12.650000]  [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
      [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
      [   12.660000]  [<ffffffff811b1c33>] T.620+0x43/0x170
      [   12.660000]  [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
      [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
      [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
      [   12.660000]  [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
      [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
      [   12.670000]  [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
      [   12.670000]  [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
      [   12.670000]  [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
      [   12.670000]  [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
      [   12.670000]  [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
      [   12.670000]  [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
      [   12.670000]  [<ffffffff8128ad40>] ? restore_args+0x0/0x30
      [   12.680000]  [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
      [   12.680000]  [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x1global0
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Cc: Hans Schillstrom <hans@schillstrom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      736561a0
    • Eric Dumazet's avatar
      ipvs: fix a typo in __ip_vs_control_init() · f40f94fc
      Eric Dumazet authored
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Julian Anastasov <ja@ssi.bg>
      Acked-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f40f94fc
    • Eric W. Biederman's avatar
      veth: Fix the byte counters · 675071a2
      Eric W. Biederman authored
      Commit 44540960 "veth: move loopback logic to common location" introduced
      a bug in the packet counters.  I don't understand why that happened as it
      is not explained in the comments and the mut check in dev_forward_skb
      retains the assumption that skb->len is the total length of the packet.
      
      I just measured this emperically by setting up a veth pair between two
      noop network namespaces setting and attempting a telnet connection between
      the two.  I saw three packets in each direction and the byte counters were
      exactly 14*3 = 42 bytes high in each direction.  I got the actual
      packet lengths with tcpdump.
      
      So remove the extra ETH_HLEN from the veth byte count totals.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      675071a2
    • Eric W. Biederman's avatar
      net ipv6: Fix duplicate /proc/sys/net/ipv6/neigh directory entries. · 9d2a8fa9
      Eric W. Biederman authored
      When I was fixing issues with unregisgtering tables under /proc/sys/net/ipv6/neigh
      by adding a mount point it appears I missed a critical ordering issue, in the
      ipv6 initialization.  I had not realized that ipv6_sysctl_register is called
      at the very end of the ipv6 initialization and in particular after we call
      neigh_sysctl_register from ndisc_init.
      
      "neigh" needs to be initialized in ipv6_static_sysctl_register which is
      the first ipv6 table to initialized, and definitely before ndisc_init.
      This removes the weirdness of duplicate tables while still providing a
      "neigh" mount point which prevents races in sysctl unregistering.
      
      This was initially reported at https://bugzilla.kernel.org/show_bug.cgi?id=31232
      Reported-by: sunkan@zappa.cx
      Signed-off-by: default avatarEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9d2a8fa9
    • Eric W. Biederman's avatar
      macvlan: Fix use after free of struct macvlan_port. · d5cd9244
      Eric W. Biederman authored
      When the macvlan driver was extended to call unregisgter_netdevice_queue
      in 23289a37, a use after free of struct
      macvlan_port was introduced.  The code in dellink relied on unregister_netdevice
      actually unregistering the net device so it would be safe to free macvlan_port.
      
      Since unregister_netdevice_queue can just queue up the unregister instead of
      performing the unregiser immediately we free the macvlan_port too soon and
      then the code in macvlan_stop removes the macaddress for the set of macaddress
      to listen for and uses memory that has already been freed.
      
      To fix this add a reference count to track when it is safe to free the macvlan_port
      and move the call of macvlan_port_destroy into macvlan_uninit which is guaranteed
      to be called after the final macvlan_port_close.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d5cd9244
    • Neil Horman's avatar
      net: fix incorrect spelling in drop monitor protocol · ac0a121d
      Neil Horman authored
      It was pointed out to me recently that my spelling could be better :)
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ac0a121d
    • Jan Altenberg's avatar
      can: c_can: Do basic c_can configuration _before_ enabling the interrupts · 4f2d56c4
      Jan Altenberg authored
      I ran into some trouble while testing the SocketCAN driver for the BOSCH
      C_CAN controller. The interface is not correctly initialized, if I put
      some CAN traffic on the line, _while_ the interface is being started
      (which means: the interface doesn't come up correcty, if there's some RX
      traffic while doing 'ifconfig can0 up').
      
      The current implementation enables the controller interrupts _before_
      doing the basic c_can configuration. I think, this should be done the
      other way round.
      
      The patch below fixes things for me.
      Signed-off-by: default avatarJan Altenberg <jan@linutronix.de>
      Acked-by: default avatarKurt Van Dijck <kurt.van.dijck@eia.be>
      Acked-by: default avatarWolfgang Grandegger <wg@grandegger.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4f2d56c4
    • Arnd Bergmann's avatar
      net/appletalk: fix atalk_release use after free · b20e7bbf
      Arnd Bergmann authored
      The BKL removal in appletalk introduced a use-after-free problem,
      where atalk_destroy_socket frees a sock, but we still release
      the socket lock on it.
      
      An easy fix is to take an extra reference on the sock and sock_put
      it when returning from atalk_release.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b20e7bbf
    • Eric Dumazet's avatar
      ipx: fix ipx_release() · 674f2115
      Eric Dumazet authored
      Commit b0d0d915 (remove the BKL) added a regression, because
      sock_put() can free memory while we are going to use it later.
      
      Fix is to delay sock_put() _after_ release_sock().
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      Tested-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Acked-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      674f2115
    • Eric Dumazet's avatar
      snmp: SNMP_UPD_PO_STATS_BH() always called from softirq · 20246a80
      Eric Dumazet authored
      We dont need to test if we run from softirq context, we definitely are.
      
      This saves few instructions in ip_rcv() & ip_rcv_finish()
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      20246a80
    • James Chapman's avatar
      l2tp: fix possible oops on l2tp_eth module unload · 8aa525a9
      James Chapman authored
      A struct used in the l2tp_eth driver for registering network namespace
      ops was incorrectly marked as __net_initdata, leading to oops when
      module unloaded.
      
      BUG: unable to handle kernel paging request at ffffffffa00ec098
      IP: [<ffffffff8123dbd8>] ops_exit_list+0x7/0x4b
      PGD 142d067 PUD 1431063 PMD 195da8067 PTE 0
      Oops: 0000 [#1] SMP 
      last sysfs file: /sys/module/l2tp_eth/refcnt
      Call Trace:
       [<ffffffff8123dc94>] ? unregister_pernet_operations+0x32/0x93
       [<ffffffff8123dd20>] ? unregister_pernet_device+0x2b/0x38
       [<ffffffff81068b6e>] ? sys_delete_module+0x1b8/0x222
       [<ffffffff810c7300>] ? do_munmap+0x254/0x318
       [<ffffffff812c64e5>] ? page_fault+0x25/0x30
       [<ffffffff812c6952>] ? system_call_fastpath+0x16/0x1b
      Signed-off-by: default avatarJames Chapman <jchapman@katalix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8aa525a9
    • Wei Yongjun's avatar
      xfrm: Fix initialize repl field of struct xfrm_state · a454f0cc
      Wei Yongjun authored
      Commit 'xfrm: Move IPsec replay detection functions to a separate file'
        (9fdc4883)
      introduce repl field to struct xfrm_state, and only initialize it
      under SA's netlink create path, the other path, such as pf_key,
      ipcomp/ipcomp6 etc, the repl field remaining uninitialize. So if
      the SA is created by pf_key, any input packet with SA's encryption
      algorithm will cause panic.
      
          int xfrm_input()
          {
              ...
              x->repl->advance(x, seq);
              ...
          }
      
      This patch fixed it by introduce new function __xfrm_init_state().
      
      Pid: 0, comm: swapper Not tainted 2.6.38-next+ #14 Bochs Bochs
      EIP: 0060:[<c078e5d5>] EFLAGS: 00010206 CPU: 0
      EIP is at xfrm_input+0x31c/0x4cc
      EAX: dd839c00 EBX: 00000084 ECX: 00000000 EDX: 01000000
      ESI: dd839c00 EDI: de3a0780 EBP: dec1de88 ESP: dec1de64
       DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
      Process swapper (pid: 0, ti=dec1c000 task=c09c0f20 task.ti=c0992000)
      Stack:
       00000000 00000000 00000002 c0ba27c0 00100000 01000000 de3a0798 c0ba27c0
       00000033 dec1de98 c0786848 00000000 de3a0780 dec1dea4 c0786868 00000000
       dec1debc c074ee56 e1da6b8c de3a0780 c074ed44 de3a07a8 dec1decc c074ef32
      Call Trace:
       [<c0786848>] xfrm4_rcv_encap+0x22/0x27
       [<c0786868>] xfrm4_rcv+0x1b/0x1d
       [<c074ee56>] ip_local_deliver_finish+0x112/0x1b1
       [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1
       [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44
       [<c074ef77>] ip_local_deliver+0x3e/0x44
       [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1
       [<c074ec03>] ip_rcv_finish+0x30a/0x332
       [<c074e8f9>] ? ip_rcv_finish+0x0/0x332
       [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44
       [<c074f188>] ip_rcv+0x20b/0x247
       [<c074e8f9>] ? ip_rcv_finish+0x0/0x332
       [<c072797d>] __netif_receive_skb+0x373/0x399
       [<c0727bc1>] netif_receive_skb+0x4b/0x51
       [<e0817e2a>] cp_rx_poll+0x210/0x2c4 [8139cp]
       [<c072818f>] net_rx_action+0x9a/0x17d
       [<c0445b5c>] __do_softirq+0xa1/0x149
       [<c0445abb>] ? __do_softirq+0x0/0x149
      Signed-off-by: default avatarWei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a454f0cc
  3. 21 Mar, 2011 1 commit