1. 28 Jan, 2013 2 commits
    • fcoe: Fix deadlock while deleting FCoE interface with NPIV ports · 94aa743a
      Neerav Parikh authored
      This patch fixes the following deadlock, caused by destroying
      an FCoE interface with active NPIV ports on that interface.
      
          Call Trace:
          [<ffffffff814b7e88>] schedule+0x64/0x66
          [<ffffffff814b6b4f>] schedule_timeout+0x36/0xe3
          [<ffffffff81070c55>] ? update_curr+0xd6/0x110
          [<ffffffff81071f6b>] ? hrtick_update+0x1b/0x4d
          [<ffffffff81072405>] ? dequeue_task_fair+0x1ca/0x1d9
          [<ffffffff8106a369>] ? need_resched+0x1e/0x28
          [<ffffffff814b7d14>] wait_for_common+0x9b/0xf1
          [<ffffffff8106e7be>] ? try_to_wake_up+0x1e0/0x1e0
          [<ffffffff814b7e22>] wait_for_completion+0x1d/0x1f
          [<ffffffff8105ae82>] flush_workqueue+0x116/0x2a1
          [<ffffffff8105b357>] drain_workqueue+0x66/0x14c
          [<ffffffff8105b8ef>] destroy_workqueue+0x1a/0xcf
          [<ffffffffa009211e>] fc_remove_host+0x154/0x17f [scsi_transport_fc]
          [<ffffffffa00edbb8>] fcoe_if_destroy+0x184/0x1c9 [fcoe]
          [<ffffffffa00edc28>] fcoe_destroy_work+0x2b/0x44 [fcoe]
          [<ffffffff8105a82a>] process_one_work+0x1a8/0x2a4
          [<ffffffffa00edbfd>] ? fcoe_if_destroy+0x1c9/0x1c9 [fcoe]
          [<ffffffff8105c396>] worker_thread+0x1db/0x268
          [<ffffffff810604a3>] ? wake_up_bit+0x2a/0x2a
          [<ffffffff8105c1bb>] ? manage_workers.clone.16+0x1f6/0x1f6
          [<ffffffff8105ffd6>] kthread+0x6f/0x77
          [<ffffffff814c0304>] kernel_thread_helper+0x4/0x10
          [<ffffffff8105ff67>] ? kthread_freezable_should_stop+0x4b/0x4b
      
          Call Trace:
          [<ffffffff814b7e88>] schedule+0x64/0x66
          [<ffffffff814b8041>] schedule_preempt_disabled+0xe/0x10
          [<ffffffff814b70a1>] __mutex_lock_common.clone.5+0x117/0x17a
          [<ffffffff814b7117>] __mutex_lock_slowpath+0x13/0x15
          [<ffffffff814b6f76>] mutex_lock+0x23/0x37
          [<ffffffff8125b890>] ? list_del+0x11/0x30
          [<ffffffffa00edc84>] fcoe_vport_destroy+0x43/0x5f [fcoe]
          [<ffffffffa009130a>] fc_vport_terminate+0x48/0x110 [scsi_transport_fc]
          [<ffffffffa00913ef>] fc_vport_sched_delete+0x1d/0x79 [scsi_transport_fc]
          [<ffffffff8105a82a>] process_one_work+0x1a8/0x2a4
          [<ffffffffa00913d2>] ? fc_vport_terminate+0x110/0x110 [scsi_transport_fc]
          [<ffffffff8105c396>] worker_thread+0x1db/0x268
          [<ffffffff8105c1bb>] ? manage_workers.clone.16+0x1f6/0x1f6
          [<ffffffff8105ffd6>] kthread+0x6f/0x77
          [<ffffffff814c0304>] kernel_thread_helper+0x4/0x10
          [<ffffffff8105ff67>] ? kthread_freezable_should_stop+0x4b/0x4b
          [<ffffffff814c0300>] ? gs_change+0x13/0x13
      
      A prior attempt to fix this issue is posted here:
      http://lists.open-fcoe.org/pipermail/devel/2012-October/012318.html
      or
      http://article.gmane.org/gmane.linux.scsi.open-fcoe.devel/11924
      
      Based on feedback and discussion with Neil Horman, it seems that the above patch
      may have a case where fcoe_vport_destroy() and fcoe_destroy_work() can
      race; hence that patch has been withdrawn in favor of this patch, which tries to
      solve the same problem in a different way.
      
      In the current approach, instead of removing the fcoe_config_mutex from the
      vport_delete callback function, I've chosen to delete all the NPIV ports first
      on a given root lport before continuing with the removal of the root lport.
      Signed-off-by: Neerav Parikh <Neerav.Parikh@intel.com>
      Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      94aa743a
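The ordering described in this commit message (delete every NPIV port on a root lport, then remove the root lport) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct port`, `struct teardown_log`, and `destroy_root_port()` are hypothetical names, not symbols from the fcoe driver, and the sketch does not model fcoe_config_mutex or the workqueues seen in the traces.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VPORTS 8
#define MAX_LOG 16

/* Hypothetical stand-in for a root lport and its NPIV children. */
struct port {
    int id;
    int alive;
    struct port *vports[MAX_VPORTS];
    size_t nr_vports;
};

/* Records the order in which ports are torn down. */
struct teardown_log {
    int ids[MAX_LOG];
    size_t len;
};

static void log_destroy(struct teardown_log *log, struct port *p)
{
    assert(log->len < MAX_LOG);
    p->alive = 0;
    log->ids[log->len++] = p->id;
}

/*
 * Tear down every child port first, then the root. After the loop no
 * child is left that could still need the root during its own teardown.
 */
void destroy_root_port(struct port *root, struct teardown_log *log)
{
    for (size_t i = 0; i < root->nr_vports; i++) {
        if (root->vports[i]->alive)
            log_destroy(log, root->vports[i]);
    }
    root->nr_vports = 0;
    log_destroy(log, root);
}
```

The point of the ordering is that the root's teardown never has to wait on child teardown that is still pending.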
    • fcoe: close race on link speed detection in fcoe code · f9184df3
      Neil Horman authored
      When creating an fcoe interface, we call fcoe_link_speed_update before we add the
      lport's fcoe interface to the fc_hostlist.  Network device events like
      NETDEV_CHANGE are only processed if an fcoe interface is found with an
      underlying netdev that matches the netdev of the event.  Since this processing
      in fcoe_device_notification is how link_speed changes get communicated to the
      libfc code (via fcoe_link_speed_update), we have a race condition: if a
      NETDEV_CHANGE event is sent after the call to fcoe_link_speed_update in
      fcoe_netdev_config, but before we add the interface to the fc_hostlist, we will
      lose the event, and attributes like /sys/class/fc_host/hostX/speed will not get
      updated properly.
      
      Fix this by moving the add to the fc_hostlist above the serialized call to
      fcoe_netdev_config, ensuring that we catch netdev events before we make a
      direct call to fcoe_link_speed_update.
      
      Also use this opportunity to clean up access to the fc_hostlist a bit by
      creating a fcoe_hostlist_del accessor and replacing the cleanup in fcoe_exit to
      use it properly.
      
      Tested by myself successfully
      
      [ Comment over 80 chars broken into multi-line by Robert Love to
        satisfy checkpatch.pl ]
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: Yi Zou <yi.zou@intel.com>
      Signed-off-by: Robert Love <robert.w.love@intel.com>
      f9184df3
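The lost-event window described above can be reproduced in a single-threaded userspace sketch by placing the event explicitly between the two steps. Everything here is an assumption made for illustration: `struct netdev`, `struct iface`, `hostlist_add()`, `hostlist_del()`, `link_speed_update()`, and `netdev_change()` are simplified stand-ins, not the driver's functions, and no locking is modeled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOSTS 4

struct netdev { int speed; };
struct iface  { struct netdev *dev; int reported_speed; };

/* Stand-in for the list that event delivery walks. */
static struct iface *hostlist[MAX_HOSTS];
static size_t nr_hosts;

void hostlist_add(struct iface *f)
{
    assert(nr_hosts < MAX_HOSTS);
    hostlist[nr_hosts++] = f;
}

void hostlist_del(struct iface *f)
{
    for (size_t i = 0; i < nr_hosts; i++) {
        if (hostlist[i] == f) {
            /* Fill the hole with the last entry. */
            hostlist[i] = hostlist[--nr_hosts];
            return;
        }
    }
}

/* Direct sample of the current link speed. */
void link_speed_update(struct iface *f)
{
    f->reported_speed = f->dev->speed;
}

/* Change event: only interfaces already on the list are updated. */
void netdev_change(struct netdev *dev, int new_speed)
{
    dev->speed = new_speed;
    for (size_t i = 0; i < nr_hosts; i++) {
        if (hostlist[i]->dev == dev)
            link_speed_update(hostlist[i]);
    }
}
```

Calling `link_speed_update()`, then `netdev_change()`, then `hostlist_add()` leaves `reported_speed` stale; calling `hostlist_add()` first means the event is seen regardless of when it arrives relative to the direct sample.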
  2. 07 Jan, 2013 1 commit
  3. 14 Dec, 2012 13 commits
  4. 04 Dec, 2012 3 commits
  5. 03 Dec, 2012 8 commits
    • Linux 3.7-rc8 · b69f0859
      Linus Torvalds authored
      b69f0859
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac · b52c6402
      Linus Torvalds authored
      Pull EDAC fixes from Mauro Carvalho Chehab:
       "One EDAC core fix, and a few driver fixes (i7300, i82975x, i7core)."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
        i7core_edac: fix panic when accessing sysfs files
        i7300_edac: Fix error flag testing
        edac: Fix the dimm filling for csrows-based layouts
        i82975x_edac: Fix dimm label initialization
      b52c6402
    • Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 4ba00329
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "Some driver fixes for s5p/exynos (mostly race fixes)"
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        [media] s5p-mfc: Handle multi-frame input buffer
        [media] s5p-mfc: Bug fix of timestamp/timecode copy mechanism
        [media] exynos-gsc: Add missing video device vfl_dir flag initialization
        [media] exynos-gsc: Fix settings for input and output image RGB type
        [media] exynos-gsc: Don't use mutex_lock_interruptible() in device release()
        [media] fimc-lite: Don't use mutex_lock_interruptible() in device release()
        [media] s5p-fimc: Don't use mutex_lock_interruptible() in device release()
        [media] s5p-fimc: Prevent race conditions during subdevs registration
      4ba00329
    • [parisc] open(2) compat bug · 25a3bc6b
      Al Viro authored
      In commit 9d73fc2d ("open*(2) compat fixes (s390, arm64)") I said:
      >
      > 	The usual rules for open()/openat()/open_by_handle_at() are
      > 1) native 32bit - don't force O_LARGEFILE in flags
      > 2) native 64bit - force O_LARGEFILE in flags
      > 3) compat on 64bit host - as for native 32bit
      > 4) native 32bit ABI for 64bit system (mips/n32, x86/x32) - as for native 64bit
      >
      > There are only two exceptions - s390 compat has open() forcing O_LARGEFILE and
      > arm64 compat has open_by_handle_at() doing the same thing.  The same binaries
      > on native host (s390/31 and arm resp.) will *not* force O_LARGEFILE, so IMO
      > both are emulation bugs.
      
      Three exceptions, actually - parisc open() is another case like that.
      Native 32bit won't force O_LARGEFILE, the same binary on parisc64 will.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      25a3bc6b
    • Revert "sched, autogroup: Stop going ahead if autogroup is disabled" · fd8ef117
      Mike Galbraith authored
      This reverts commit 800d4d30.
      
      Between commits 8323f26c ("sched: Fix race in task_group()") and
      800d4d30 ("sched, autogroup: Stop going ahead if autogroup is
      disabled"), autogroup is a wreck.
      
      With both applied, all you have to do to crash a box is disable
      autogroup during boot up, then reboot..  boom, NULL pointer dereference
      due to commit 800d4d30 not allowing autogroup to move things, and
      commit 8323f26c making that the only way to switch runqueues:
      
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: [<ffffffff81063ac0>] effective_load.isra.43+0x50/0x90
        Pid: 7047, comm: systemd-user-se Not tainted 3.6.8-smp #7 MEDIONPC MS-7502/MS-7502
        RIP: effective_load.isra.43+0x50/0x90
        Process systemd-user-se (pid: 7047, threadinfo ffff880221dde000, task ffff88022618b3a0)
        Call Trace:
          select_task_rq_fair+0x255/0x780
          try_to_wake_up+0x156/0x2c0
          wake_up_state+0xb/0x10
          signal_wake_up+0x28/0x40
          complete_signal+0x1d6/0x250
          __send_signal+0x170/0x310
          send_signal+0x40/0x80
          do_send_sig_info+0x47/0x90
          group_send_sig_info+0x4a/0x70
          kill_pid_info+0x3a/0x60
          sys_kill+0x97/0x1a0
          ? vfs_read+0x120/0x160
          ? sys_read+0x45/0x90
          system_call_fastpath+0x16/0x1b
        Code: 49 0f af 41 50 31 d2 49 f7 f0 48 83 f8 01 48 0f 46 c6 48 2b 07 48 8b bf 40 01 00 00 48 85 ff 74 3a 45 31 c0 48 8b 8f 50 01 00 00 <48> 8b 11 4c 8b 89 80 00 00 00 49 89 d2 48 01 d0 45 8b 59 58 4c
        RIP  [<ffffffff81063ac0>] effective_load.isra.43+0x50/0x90
         RSP <ffff880221ddfbd8>
        CR2: 0000000000000000
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Yong Zhang <yong.zhang0@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stable@vger.kernel.org # 2.6.39+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fd8ef117
    • Merge branch 'block-dev' · d3594ea2
      Linus Torvalds authored
      Merge 'block-dev' branch.
      
      I was going to just mark everything here for stable and leave it to the
      3.8 merge window, but having decided on doing another -rc, I might as
      well merge it now.
      
      This removes the bd_block_size_semaphore semaphore that was added in
      this release to fix a race condition between block size changes and
      block IO, and replaces it with atomicity guarantees in fs/buffer.c
      instead, along with simplifying fs/block-dev.c.
      
      This removes more lines than it adds, makes the code generally simpler,
      and avoids the latency/rt issues that the block size semaphore
      introduced for mount.
      
      I'm not happy with the timing, but it wouldn't be much better doing this
      during the merge window and then having some delayed back-port of it
      into stable.
      
      * block-dev:
        blkdev_max_block: make private to fs/buffer.c
        direct-io: don't read inode->i_blkbits multiple times
        blockdev: remove bd_block_size_semaphore again
        fs/buffer.c: make block-size be per-page and protected by the page lock
      d3594ea2
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 7e5530af
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) 8139cp leaks memory in error paths, from Francois Romieu.
      
       2) do_tcp_sendpages() cannot handle order > 0 pages, but they can
          certainly arrive there now, fix from Eric Dumazet.
      
       3) Race condition and sysfs fixes in bonding from Nikolay Aleksandrov.
      
       4) Remain-on-Channel fix in mac80211 from Felix Liao.
      
       5) CCK rate calculation fix in iwlwifi, from Emmanuel Grumbach.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        8139cp: fix coherent mapping leak in error path.
        tcp: fix crashes in do_tcp_sendpages()
        bonding: fix race condition in bonding_store_slaves_active
        bonding: make arp_ip_target parameter checks consistent with sysfs
        bonding: fix miimon and arp_interval delayed work race conditions
        mac80211: fix remain-on-channel (non-)cancelling
        iwlwifi: fix the basic CCK rates calculation
      7e5530af
    • Merge tag 'md-3.7-fixes' of git://neil.brown.name/md · 4ccc8045
      Linus Torvalds authored
      Pull md bugfix from NeilBrown:
       "Single bugfix for raid1/raid10.
      
        Fixes a recently introduced deadlock."
      
      * tag 'md-3.7-fixes' of git://neil.brown.name/md:
        md/raid1{,0}: fix deadlock in bitmap_unplug.
      4ccc8045
  6. 02 Dec, 2012 5 commits
    • open*(2) compat fixes (s390, arm64) · 9d73fc2d
      Al Viro authored
      The usual rules for open()/openat()/open_by_handle_at() are
       1) native 32bit - don't force O_LARGEFILE in flags
       2) native 64bit - force O_LARGEFILE in flags
       3) compat on 64bit host - as for native 32bit
       4) native 32bit ABI for 64bit system (mips/n32, x86/x32) - as for
          native 64bit
      
      There are only two exceptions - s390 compat has open() forcing
      O_LARGEFILE and arm64 compat has open_by_handle_at() doing the same
      thing.  The same binaries on native host (s390/31 and arm resp.) will
      *not* force O_LARGEFILE, so IMO both are emulation bugs.
      
      Objections? The fix is obvious...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9d73fc2d
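The four rules listed in this commit message can be written down as a small decision function. This is a minimal sketch that only encodes the rules as stated above; `enum abi_kind` and `forces_largefile()` are hypothetical names chosen for illustration and do not correspond to kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* The four cases from the commit message. */
enum abi_kind {
    ABI_NATIVE_32,        /* 1) native 32bit */
    ABI_NATIVE_64,        /* 2) native 64bit */
    ABI_COMPAT_ON_64,     /* 3) compat on 64bit host */
    ABI_32BIT_ABI_ON_64   /* 4) native 32bit ABI for 64bit system (mips/n32, x86/x32) */
};

/* Returns true when open()/openat()/open_by_handle_at() should force O_LARGEFILE. */
bool forces_largefile(enum abi_kind abi)
{
    switch (abi) {
    case ABI_NATIVE_64:
    case ABI_32BIT_ABI_ON_64:
        return true;
    case ABI_NATIVE_32:
    case ABI_COMPAT_ON_64:
        return false;
    }
    assert(!"unknown ABI kind");
    return false;
}
```

Under these rules, a compat task on a 64bit host gets the same answer as the same binary on its native 32bit host, which is the property the exceptions named in the message violate.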
    • Merge branch 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 3c46f3d6
      Linus Torvalds authored
      Pull late workqueue fixes from Tejun Heo:
       "Unfortunately, I have two really late fixes.  One was for a
        long-standing bug and queued for 3.8 but I found out about a
        regression introduced during 3.7-rc1 two days ago, so I'm sending out
        the two fixes together.
      
        The first (long-standing) one is rescuer_thread() entering exit path
        w/ TASK_INTERRUPTIBLE.  It only triggers on workqueue destruction,
        which isn't very frequent, and the exit path can usually survive being
        called with TASK_INTERRUPTIBLE, so it was hidden pretty well.  Apparently,
        if you're reiserfs, this could lead to the exiting kthread sleeping
        indefinitely holding a mutex, which is never good.
      
        The fix is simple - restoring TASK_RUNNING before returning from the
        kthread function.
      
        The second one is introduced by the new mod_delayed_work().
        mod_delayed_work() was missing special case handling for 0 delay.
        Instead of queueing the work item immediately, it queued the timer
        which expires on the closest next tick.  Some users of the new
        function converted from "[__]cancel_delayed_work() +
        queue_delayed_work()" combination became unhappy with the extra delay.
      
        Block unplugging led to noticeably higher number of context switches
        and intel 6250 wireless failed to associate with WPA-Enterprise
        network.  The fix, again, is fairly simple.  The 0 delay special case
        logic from queue_delayed_work_on() should be moved to
        __queue_delayed_work() which is shared by both queue_delayed_work_on()
        and mod_delayed_work_on().
      
        The first one is difficult to trigger and the failure mode for the
        latter isn't completely catastrophic, so missing these two for 3.7
        wouldn't make it a disastrous release, but both bugs are nasty and the
        fixes are fairly safe"
      
      * 'for-3.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay
        workqueue: exit rescuer_thread() as TASK_RUNNING
      3c46f3d6
    • 8139cp: fix coherent mapping leak in error path. · 892a925e
      françois romieu authored
      cp_open
      [...]
              rc = cp_alloc_rings(cp);
              if (rc)
                      return rc;
      
      cp_alloc_rings
      [...]
              mem = dma_alloc_coherent(&cp->pdev->dev, CP_RING_BYTES,
                                       &cp->ring_dma, GFP_KERNEL);
      
      - cp_alloc_rings never frees the coherent mapping it allocates
      - neither does cp_open when cp_alloc_rings fails
      Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      892a925e
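The general shape of the leak described above, and of a fix for it, can be sketched in userspace C. This is an illustration only, not the actual patch: `malloc()`/`free()` stand in for the coherent DMA allocation, and `struct dev_ctx`, `init_rings()`, `alloc_rings()`, `free_rings()`, and the `fail_init` knob are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define RING_BYTES 4096

struct dev_ctx {
    void *ring;
    int fail_init;   /* test knob: make the post-allocation step fail */
};

/* Counts outstanding ring buffers so a leak is observable. */
static int live_allocs;

static void *ring_alloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_allocs++;
    return p;
}

static void ring_free(void *p)
{
    if (p) {
        free(p);
        live_allocs--;
    }
}

/* Stand-in for a setup step that can fail after the buffer exists. */
static int init_rings(struct dev_ctx *ctx)
{
    return ctx->fail_init ? -ENOMEM : 0;
}

/* Allocate the ring; on any later failure, release it before returning. */
int alloc_rings(struct dev_ctx *ctx)
{
    int rc;

    ctx->ring = ring_alloc(RING_BYTES);
    if (!ctx->ring)
        return -ENOMEM;

    rc = init_rings(ctx);
    if (rc) {
        ring_free(ctx->ring);
        ctx->ring = NULL;
    }
    return rc;
}

void free_rings(struct dev_ctx *ctx)
{
    ring_free(ctx->ring);
    ctx->ring = NULL;
}

int live_ring_allocs(void)
{
    return live_allocs;
}
```

Because `alloc_rings()` cleans up after itself, a caller that simply returns the error code, as the `cp_open` excerpt does, no longer leaks the buffer.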
    • tcp: fix crashes in do_tcp_sendpages() · 64022d0b
      Eric Dumazet authored
      Recent network changes allowed high-order pages to be used
      for skb fragments.
      
      This uncovered a bug in do_tcp_sendpages() which was assuming its caller
      provided an array of order-0 page pointers.
      
      We only have to deal with a single page in this function, and its order
      is irrelevant.
      Reported-by: Willy Tarreau <w@1wt.eu>
      Tested-by: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      64022d0b
    • workqueue: mod_delayed_work_on() shouldn't queue timer on 0 delay · 8852aac2
      Tejun Heo authored
      8376fe22 ("workqueue: implement mod_delayed_work[_on]()")
      implemented mod_delayed_work[_on]() using the improved
      try_to_grab_pending().  The function is later used, among others, to
      replace [__]cancel_delayed_work() + queue_delayed_work() combinations.
      
      Unfortunately, a delayed_work item w/ zero @delay is handled slightly
      differently by mod_delayed_work_on() compared to
      queue_delayed_work_on().  The latter skips timer altogether and
      directly queues it using queue_work_on() while the former schedules
      timer which will expire on the closest tick.  This means, when @delay
      is zero, that [__]cancel_delayed_work() + queue_delayed_work_on()
      makes the target item immediately executable while
      mod_delayed_work_on() may induce delay of up to a full tick.
      
      This somewhat subtle difference breaks some of the converted users.
      e.g. block queue plugging uses delayed_work for deferred processing
      and uses mod_delayed_work_on() when the queue needs to be immediately
      unplugged.  The above problem manifested as noticeably higher number
      of context switches under certain circumstances.
      
      The difference in behavior was caused by missing special case handling
      for 0 delay in mod_delayed_work_on() compared to
      queue_delayed_work_on().  Joonsoo Kim posted a patch to add it -
      ("workqueue: optimize mod_delayed_work_on() when @delay == 0")[1].
      The patch was queued for 3.8 but it was described as an optimization and
      I missed that it was a correctness issue.
      
      As both queue_delayed_work_on() and mod_delayed_work_on() use
      __queue_delayed_work() for queueing, it seems that the better approach
      is to move the 0 delay special handling to the function instead of
      duplicating it in mod_delayed_work_on().
      
      Fix the problem by moving 0 delay special case handling from
      queue_delayed_work_on() to __queue_delayed_work().  This replaces
      Joonsoo's patch.
      
      [1] http://thread.gmane.org/gmane.linux.kernel/1379011/focus=1379012
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-and-tested-by: Anders Kaseorg <andersk@MIT.EDU>
      Reported-and-tested-by: Zlatko Calusic <zlatko.calusic@iskon.hr>
      LKML-Reference: <alpine.DEB.2.00.1211280953350.26602@dr-wily.mit.edu>
      LKML-Reference: <50A78AA9.5040904@iskon.hr>
      Cc: Joonsoo Kim <js1304@gmail.com>
      8852aac2
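The design choice in this commit (put the zero-delay special case in the helper both entry points share, rather than duplicating it) can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: `struct delayed_work` here is a toy type, and `queue_delayed_common()`, `queue_delayed()`, and `mod_delayed()` are hypothetical stand-ins for __queue_delayed_work(), queue_delayed_work_on(), and mod_delayed_work_on().

```c
#include <assert.h>
#include <stdbool.h>

enum work_state { WORK_IDLE, WORK_TIMER_PENDING, WORK_QUEUED };

struct delayed_work {
    enum work_state state;
    unsigned long expires_in;   /* ticks until the timer fires */
};

/*
 * Shared helper. The zero-delay special case lives here, so every
 * caller gets immediate queueing without duplicating the check.
 */
static void queue_delayed_common(struct delayed_work *dw, unsigned long delay)
{
    assert(dw);
    if (delay == 0) {
        dw->expires_in = 0;
        dw->state = WORK_QUEUED;        /* skip the timer entirely */
        return;
    }
    dw->expires_in = delay;
    dw->state = WORK_TIMER_PENDING;     /* runs once the timer expires */
}

/* Queue only if the item is not already pending. */
bool queue_delayed(struct delayed_work *dw, unsigned long delay)
{
    if (dw->state != WORK_IDLE)
        return false;
    queue_delayed_common(dw, delay);
    return true;
}

/* Cancel whatever is pending, then requeue with the new delay. */
bool mod_delayed(struct delayed_work *dw, unsigned long delay)
{
    bool was_pending = dw->state != WORK_IDLE;

    dw->state = WORK_IDLE;
    queue_delayed_common(dw, delay);
    return was_pending;
}
```

With the check in the shared helper, `mod_delayed(dw, 0)` on an item whose timer is pending makes it immediately runnable, matching what a cancel followed by `queue_delayed(dw, 0)` would do.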
  7. 01 Dec, 2012 8 commits