1. 23 Mar, 2011 27 commits
  2. 22 Mar, 2011 13 commits
    • David S. Miller's avatar
    • Julian Anastasov's avatar
      ipv4: optimize route adding on secondary promotion · 04024b93
      Julian Anastasov authored
      Optimize the calling of fib_add_ifaddr for all
      secondary addresses after the promoted one to start from
      their place, not from the new place of the promoted
      secondary. It will save some CPU cycles because we
      are sure the promoted secondary was first for the subnet
      and all next secondaries do not change their place.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      04024b93
    • Julian Anastasov's avatar
      ipv4: remove the routes on secondary promotion · 2d230e2b
      Julian Anastasov authored
      The secondary address promotion relies on fib_sync_down_addr
      to remove all routes created for the secondary addresses when
      the old primary address is deleted. It does not happen for cases
      when the primary address is also in another subnet. Fix that
      by deleting local and broadcast routes for all secondaries while
      they are on device list and by faking that all addresses from
      this subnet are to be deleted. It relies on fib_del_ifaddr being
      able to ignore the IPs from the concerned subnet while checking
      for duplication.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2d230e2b
    • Julian Anastasov's avatar
      ipv4: fix route deletion for IPs on many subnets · e6abbaa2
      Julian Anastasov authored
      Alex Sidorenko reported for problems with local
      routes left after IP addresses are deleted. It happens
      when same IPs are used in more than one subnet for the
      device.
      
      	Fix fib_del_ifaddr to restrict the checks for duplicate
      local and broadcast addresses only to the IFAs that use
      our primary IFA or another primary IFA with same address.
      And we expect the prefsrc to be matched when the routes
      are deleted because it is possible they to differ only by
      prefsrc. This patch prevents local and broadcast routes
      to be leaked until their primary IP is deleted finally
      from the box.
      
      	As the secondary address promotion needs to delete
      the routes for all secondaries that used the old primary IFA,
      add option to ignore these secondaries from the checks and
      to assume they are already deleted, so that we can safely
      delete the route while these IFAs are still on the device list.
      Reported-by: default avatarAlex Sidorenko <alexandre.sidorenko@hp.com>
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e6abbaa2
    • Julian Anastasov's avatar
      ipv4: match prefsrc when deleting routes · 74cb3c10
      Julian Anastasov authored
      fib_table_delete forgets to match the routes by prefsrc.
      Callers can specify known IP in fc_prefsrc and we should remove
      the exact route. This is needed for cases when same local or
      broadcast addresses are used in different subnets and the
      routes differ only in prefsrc. All callers that do not provide
      fc_prefsrc will ignore the route prefsrc as before and will
      delete the first occurence. That is how the ip route del default
      magic works.
      
      	Current callers are:
      
      - ip_rt_ioctl where rtentry_to_fib_config provides fc_prefsrc only
      when the provided device name matches IP label with colon.
      
      - inet_rtm_delroute where RTA_PREFSRC is optional too
      
      - fib_magic which deals with routes when deleting addresses
      and where the fc_prefsrc is always set with the primary IP
      for the concerned IFA.
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      74cb3c10
    • Marc Zyngier's avatar
      NET: smsc95xx: don't use stack for async writes to the device · 3c0f3c60
      Marc Zyngier authored
      The set_multicast operation performs asynchronous writes to the
      device, with some addresses pointing to the stack. Bad things may
      happen, and this is trapped CONFIG_DMA_API_DEBUG:
      
      [    5.237762] WARNING: at /build/buildd/linux-linaro-omap-2.6.38/lib/dma-debug.c:867 check_for_stack+0xd4/0x100()
      [    5.237792] ehci-omap ehci-omap.0: DMA-API: device driver maps memory fromstack [addr=d9c77dec]
      [    5.237792] Modules linked in: smsc95xx(+) usbnet twl6030_usb twl4030_pwrbutton leds_gpio omap_wdt omap2_mcspi
      [    5.237854] [<c006d618>] (unwind_backtrace+0x0/0xf8) from [<c00a6a14>] (warn_slowpath_common+0x54/0x64)
      [    5.237884] [<c00a6a14>] (warn_slowpath_common+0x54/0x64) from [<c00a6ab8>] (warn_slowpath_fmt+0x30/0x40)
      [    5.237915] [<c00a6ab8>] (warn_slowpath_fmt+0x30/0x40) from [<c034e9d8>] (check_for_stack+0xd4/0x100)
      [    5.237915] [<c034e9d8>] (check_for_stack+0xd4/0x100) from [<c034fea8>] (debug_dma_map_page+0xb4/0xdc)
      [    5.237976] [<c034fea8>] (debug_dma_map_page+0xb4/0xdc) from [<c04242f0>] (map_urb_for_dma+0x26c/0x304)
      [    5.237976] [<c04242f0>] (map_urb_for_dma+0x26c/0x304) from [<c0424594>] (usb_hcd_submit_urb+0x78/0x19c)
      [    5.238037] [<c0424594>] (usb_hcd_submit_urb+0x78/0x19c) from [<bf049c5c>] (smsc95xx_write_reg_async+0xb4/0x130 [smsc95xx])
      [    5.238067] [<bf049c5c>] (smsc95xx_write_reg_async+0xb4/0x130 [smsc95xx]) from [<bf049dd4>] (smsc95xx_set_multicast+0xfc/0x148 [smsc95xx])
      [    5.238098] [<bf049dd4>] (smsc95xx_set_multicast+0xfc/0x148 [smsc95xx]) from [<bf04a118>] (smsc95xx_reset+0x2f8/0x68c [smsc95xx])
      [    5.238128] [<bf04a118>] (smsc95xx_reset+0x2f8/0x68c [smsc95xx]) from [<bf04a8cc>] (smsc95xx_bind+0xcc/0x188 [smsc95xx])
      [    5.238159] [<bf04a8cc>] (smsc95xx_bind+0xcc/0x188 [smsc95xx]) from [<bf03ef1c>] (usbnet_probe+0x204/0x4c4 [usbnet])
      [    5.238220] [<bf03ef1c>] (usbnet_probe+0x204/0x4c4 [usbnet]) from [<c0429078>] (usb_probe_interface+0xe4/0x1c4)
      [    5.238250] [<c0429078>] (usb_probe_interface+0xe4/0x1c4) from [<c03a8770>] (really_probe+0x64/0x160)
      [    5.238250] [<c03a8770>] (really_probe+0x64/0x160) from [<c03a8a30>] (driver_probe_device+0x48/0x60)
      [    5.238281] [<c03a8a30>] (driver_probe_device+0x48/0x60) from [<c03a8ad4>] (__driver_attach+0x8c/0x90)
      [    5.238311] [<c03a8ad4>] (__driver_attach+0x8c/0x90) from [<c03a7b24>] (bus_for_each_dev+0x50/0x7c)
      [    5.238311] [<c03a7b24>] (bus_for_each_dev+0x50/0x7c) from [<c03a82ec>] (bus_add_driver+0x190/0x250)
      [    5.238311] [<c03a82ec>] (bus_add_driver+0x190/0x250) from [<c03a8cf8>] (driver_register+0x78/0x13c)
      [    5.238433] [<c03a8cf8>] (driver_register+0x78/0x13c) from [<c0428040>] (usb_register_driver+0x78/0x13c)
      [    5.238464] [<c0428040>] (usb_register_driver+0x78/0x13c) from [<c005b680>] (do_one_initcall+0x34/0x188)
      [    5.238494] [<c005b680>] (do_one_initcall+0x34/0x188) from [<c00e11f0>] (sys_init_module+0xb0/0x1c0)
      [    5.238525] [<c00e11f0>] (sys_init_module+0xb0/0x1c0) from [<c0065c40>] (ret_fast_syscall+0x0/0x30)
      
      Move the two offenders to the private structure which is kmalloc-ed,
      and thus safe.
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Cc: Steve Glendinning <steve.glendinning@smsc.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3c0f3c60
    • Michał Mirosław's avatar
      net: implement dev_disable_lro() hw_features compatibility · 27660515
      Michał Mirosław authored
      Implement compatibility with new hw_features for dev_disable_lro().
      This is a transition path - dev_disable_lro() should be later
      integrated into netdev_fix_features() after all drivers are converted.
      Signed-off-by: default avatarMichał Mirosław <mirq-linux@rere.qmqm.pl>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      27660515
    • Simon Horman's avatar
      IPVS: Use global mutex in ip_vs_app.c · 736561a0
      Simon Horman authored
      As part of the work to make IPVS network namespace aware
      __ip_vs_app_mutex was replaced by a per-namespace lock,
      ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes.
      
      Unfortunately this implementation results in ipvs->app_key residing
      in non-static storage which at the very least causes a lockdep warning.
      
      This patch takes the rather heavy-handed approach of reinstating
      __ip_vs_app_mutex which will cover access to the ipvs->list_head
      of all network namespaces.
      
      [   12.610000] IPVS: Creating netns size=2456 id=0
      [   12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
      [   12.640000] BUG: key ffff880003bbf1a0 not in .data!
      [   12.640000] ------------[ cut here ]------------
      [   12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570()
      [   12.640000] Hardware name: Bochs
      [   12.640000] Pid: 1, comm: swapper Tainted: G        W 2.6.38-kexec-06330-g69b7efe-dirty #122
      [   12.650000] Call Trace:
      [   12.650000]  [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
      [   12.650000]  [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
      [   12.650000]  [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
      [   12.650000]  [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
      [   12.650000]  [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
      [   12.650000]  [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
      [   12.650000]  [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
      [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
      [   12.660000]  [<ffffffff811b1c33>] T.620+0x43/0x170
      [   12.660000]  [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
      [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
      [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
      [   12.660000]  [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
      [   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
      [   12.670000]  [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
      [   12.670000]  [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
      [   12.670000]  [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
      [   12.670000]  [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
      [   12.670000]  [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
      [   12.670000]  [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
      [   12.670000]  [<ffffffff8128ad40>] ? restore_args+0x0/0x30
      [   12.680000]  [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
      [   12.680000]  [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x1global0
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Julian Anastasov <ja@ssi.bg>
      Cc: Hans Schillstrom <hans@schillstrom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      736561a0
    • Eric Dumazet's avatar
      ipvs: fix a typo in __ip_vs_control_init() · f40f94fc
      Eric Dumazet authored
      Reported-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Cc: Simon Horman <horms@verge.net.au>
      Cc: Julian Anastasov <ja@ssi.bg>
      Acked-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f40f94fc
    • Eric W. Biederman's avatar
      veth: Fix the byte counters · 675071a2
      Eric W. Biederman authored
      Commit 44540960 "veth: move loopback logic to common location" introduced
      a bug in the packet counters.  I don't understand why that happened as it
      is not explained in the comments and the mut check in dev_forward_skb
      retains the assumption that skb->len is the total length of the packet.
      
      I just measured this emperically by setting up a veth pair between two
      noop network namespaces setting and attempting a telnet connection between
      the two.  I saw three packets in each direction and the byte counters were
      exactly 14*3 = 42 bytes high in each direction.  I got the actual
      packet lengths with tcpdump.
      
      So remove the extra ETH_HLEN from the veth byte count totals.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      675071a2
    • Eric W. Biederman's avatar
      net ipv6: Fix duplicate /proc/sys/net/ipv6/neigh directory entries. · 9d2a8fa9
      Eric W. Biederman authored
      When I was fixing issues with unregisgtering tables under /proc/sys/net/ipv6/neigh
      by adding a mount point it appears I missed a critical ordering issue, in the
      ipv6 initialization.  I had not realized that ipv6_sysctl_register is called
      at the very end of the ipv6 initialization and in particular after we call
      neigh_sysctl_register from ndisc_init.
      
      "neigh" needs to be initialized in ipv6_static_sysctl_register which is
      the first ipv6 table to initialized, and definitely before ndisc_init.
      This removes the weirdness of duplicate tables while still providing a
      "neigh" mount point which prevents races in sysctl unregistering.
      
      This was initially reported at https://bugzilla.kernel.org/show_bug.cgi?id=31232
      Reported-by: sunkan@zappa.cx
      Signed-off-by: default avatarEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9d2a8fa9
    • Eric W. Biederman's avatar
      macvlan: Fix use after free of struct macvlan_port. · d5cd9244
      Eric W. Biederman authored
      When the macvlan driver was extended to call unregisgter_netdevice_queue
      in 23289a37, a use after free of struct
      macvlan_port was introduced.  The code in dellink relied on unregister_netdevice
      actually unregistering the net device so it would be safe to free macvlan_port.
      
      Since unregister_netdevice_queue can just queue up the unregister instead of
      performing the unregiser immediately we free the macvlan_port too soon and
      then the code in macvlan_stop removes the macaddress for the set of macaddress
      to listen for and uses memory that has already been freed.
      
      To fix this add a reference count to track when it is safe to free the macvlan_port
      and move the call of macvlan_port_destroy into macvlan_uninit which is guaranteed
      to be called after the final macvlan_port_close.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@aristanetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d5cd9244
    • Neil Horman's avatar
      net: fix incorrect spelling in drop monitor protocol · ac0a121d
      Neil Horman authored
      It was pointed out to me recently that my spelling could be better :)
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ac0a121d