1. 30 Apr, 2014 40 commits
    • Mizuma, Masayoshi's avatar
      mm: hugetlb: fix softlockup when a large number of hugepages are freed. · 057d8772
      Mizuma, Masayoshi authored
      commit 55f67141 upstream.
      
      When I decrease the value of nr_hugepage in procfs a lot, softlockup
      happens.  It is because there is no chance of context switch during this
      process.
      
      On the other hand, when I allocate a large number of hugepages, there is
      some chance of context switch.  Hence softlockup doesn't happen during
      this process.  So it's necessary to add the context switch in the
      freeing process as same as allocating process to avoid softlockup.
      
      When I freed 12 TB hugapages with kernel-2.6.32-358.el6, the freeing
      process occupied a CPU over 150 seconds and following softlockup message
      appeared twice or more.
      
      $ echo 6000000 > /proc/sys/vm/nr_hugepages
      $ cat /proc/sys/vm/nr_hugepages
      6000000
      $ grep ^Huge /proc/meminfo
      HugePages_Total:   6000000
      HugePages_Free:    6000000
      HugePages_Rsvd:        0
      HugePages_Surp:        0
      Hugepagesize:       2048 kB
      $ echo 0 > /proc/sys/vm/nr_hugepages
      
      BUG: soft lockup - CPU#16 stuck for 67s! [sh:12883] ...
      Pid: 12883, comm: sh Not tainted 2.6.32-358.el6.x86_64 #1
      Call Trace:
        free_pool_huge_page+0xb8/0xd0
        set_max_huge_pages+0x128/0x190
        hugetlb_sysctl_handler_common+0x113/0x140
        hugetlb_sysctl_handler+0x1e/0x20
        proc_sys_call_handler+0x97/0xd0
        proc_sys_write+0x14/0x20
        vfs_write+0xb8/0x1a0
        sys_write+0x51/0x90
        __audit_syscall_exit+0x265/0x290
        system_call_fastpath+0x16/0x1b
      
      I have not confirmed this problem with upstream kernels because I am not
      able to prepare the machine equipped with 12TB memory now.  However I
      confirmed that the amount of decreasing hugepages was directly
      proportional to the amount of required time.
      
      I measured required times on a smaller machine.  It showed 130-145
      hugepages decreased in a millisecond.
      
        Amount of decreasing     Required time      Decreasing rate
        hugepages                     (msec)         (pages/msec)
        ------------------------------------------------------------
        10,000 pages == 20GB         70 -  74          135-142
        30,000 pages == 60GB        208 - 229          131-144
      
      It means decrement of 6TB hugepages will trigger softlockup with the
      default threshold 20sec, in this decreasing rate.
      Signed-off-by: default avatarMasayoshi Mizuma <m.mizuma@jp.fujitsu.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      057d8772
    • Vlastimil Babka's avatar
      mm: try_to_unmap_cluster() should lock_page() before mlocking · 8e8836ab
      Vlastimil Babka authored
      commit 57e68e9c upstream.
      
      A BUG_ON(!PageLocked) was triggered in mlock_vma_page() by Sasha Levin
      fuzzing with trinity.  The call site try_to_unmap_cluster() does not lock
      the pages other than its check_page parameter (which is already locked).
      
      The BUG_ON in mlock_vma_page() is not documented and its purpose is
      somewhat unclear, but apparently it serializes against page migration,
      which could otherwise fail to transfer the PG_mlocked flag.  This would
      not be fatal, as the page would be eventually encountered again, but
      NR_MLOCK accounting would become distorted nevertheless.  This patch adds
      a comment to the BUG_ON in mlock_vma_page() and munlock_vma_page() to that
      effect.
      
      The call site try_to_unmap_cluster() is fixed so that for page !=
      check_page, trylock_page() is attempted (to avoid possible deadlocks as we
      already have check_page locked) and mlock_vma_page() is performed only
      upon success.  If the page lock cannot be obtained, the page is left
      without PG_mlocked, which is again not a problem in the whole unevictable
      memory design.
      Signed-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      8e8836ab
    • Nicholas Bellinger's avatar
      iscsi-target: Fix ERL=2 ASYNC_EVENT connection pointer bug · 9acd4122
      Nicholas Bellinger authored
      commit d444edc6 upstream.
      
      This patch fixes a long-standing bug in iscsit_build_conn_drop_async_message()
      where during ERL=2 connection recovery, a bogus conn_p pointer could
      end up being used to send the ISCSI_OP_ASYNC_EVENT + DROPPING_CONNECTION
      notifying the initiator that cmd->logout_cid has failed.
      
      The bug was manifesting itself as an OOPs in iscsit_allocate_cmd() with
      a bogus conn_p pointer in iscsit_build_conn_drop_async_message().
      Reported-by: default avatarArshad Hussain <arshad.hussain@calsoftinc.com>
      Reported-by: default avatarsantosh kulkarni <santosh.kulkarni@calsoftinc.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9acd4122
    • alex chen's avatar
      ocfs2: do not put bh when buffer_uptodate failed · 4c5dd0ae
      alex chen authored
      commit f7cf4f5b upstream.
      
      Do not put bh when buffer_uptodate failed in ocfs2_write_block and
      ocfs2_write_super_or_backup, because it will put bh in b_end_io.
      Otherwise it will hit a warning "VFS: brelse: Trying to free free
      buffer".
      Signed-off-by: default avatarAlex Chen <alex.chen@huawei.com>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Reviewed-by: default avatarSrinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Acked-by: default avatarJoel Becker <jlbec@evilplan.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4c5dd0ae
    • Junxiao Bi's avatar
      ocfs2: dlm: fix recovery hung · f5c4690c
      Junxiao Bi authored
      commit ded2cf71 upstream.
      
      There is a race window in dlm_do_recovery() between dlm_remaster_locks()
      and dlm_reset_recovery() when the recovery master nearly finish the
      recovery process for a dead node.  After the master sends FINALIZE_RECO
      message in dlm_remaster_locks(), another node may become the recovery
      master for another dead node, and then send the BEGIN_RECO message to
      all the nodes included the old master, in the handler of this message
      dlm_begin_reco_handler() of old master, dlm->reco.dead_node and
      dlm->reco.new_master will be set to the second dead node and the new
      master, then in dlm_reset_recovery(), these two variables will be reset
      to default value.  This will cause new recovery master can not finish
      the recovery process and hung, at last the whole cluster will hung for
      recovery.
      
      old recovery master:                                 new recovery master:
      dlm_remaster_locks()
                                                        become recovery master for
                                                        another dead node.
                                                        dlm_send_begin_reco_message()
      dlm_begin_reco_handler()
      {
       if (dlm->reco.state & DLM_RECO_STATE_FINALIZE) {
        return -EAGAIN;
       }
       dlm_set_reco_master(dlm, br->node_idx);
       dlm_set_reco_dead_node(dlm, br->dead_node);
      }
      dlm_reset_recovery()
      {
       dlm_set_reco_dead_node(dlm, O2NM_INVALID_NODE_NUM);
       dlm_set_reco_master(dlm, O2NM_INVALID_NODE_NUM);
      }
                                                        will hang in dlm_remaster_locks() for
                                                        request dlm locks info
      
      Before send FINALIZE_RECO message, recovery master should set
      DLM_RECO_STATE_FINALIZE for itself and clear it after the recovery done,
      this can break the race windows as the BEGIN_RECO messages will not be
      handled before DLM_RECO_STATE_FINALIZE flag is cleared.
      
      A similar race may happen between new recovery master and normal node
      which is in dlm_finalize_reco_handler(), also fix it.
      Signed-off-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: default avatarSrinivas Eeda <srinivas.eeda@oracle.com>
      Reviewed-by: default avatarWengang Wang <wen.gang.wang@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      f5c4690c
    • Junxiao Bi's avatar
      ocfs2: dlm: fix lock migration crash · ca2ba53d
      Junxiao Bi authored
      commit 34aa8dac upstream.
      
      This issue was introduced by commit 800deef3 ("ocfs2: use
      list_for_each_entry where benefical") in 2007 where it replaced
      list_for_each with list_for_each_entry.  The variable "lock" will point
      to invalid data if "tmpq" list is empty and a panic will be triggered
      due to this.  Sunil advised reverting it back, but the old version was
      also not right.  At the end of the outer for loop, that
      list_for_each_entry will also set "lock" to an invalid data, then in the
      next loop, if the "tmpq" list is empty, "lock" will be an stale invalid
      data and cause the panic.  So reverting the list_for_each back and reset
      "lock" to NULL to fix this issue.
      
      Another concern is that this seemes can not happen because the "tmpq"
      list should not be empty.  Let me describe how.
      
      old lock resource owner(node 1):                                  migratation target(node 2):
      image there's lockres with a EX lock from node 2 in
      granted list, a NR lock from node x with convert_type
      EX in converting list.
      dlm_empty_lockres() {
       dlm_pick_migration_target() {
         pick node 2 as target as its lock is the first one
         in granted list.
       }
       dlm_migrate_lockres() {
         dlm_mark_lockres_migrating() {
           res->state |= DLM_LOCK_RES_BLOCK_DIRTY;
           wait_event(dlm->ast_wq, !dlm_lockres_is_dirty(dlm, res));
      	 //after the above code, we can not dirty lockres any more,
           // so dlm_thread shuffle list will not run
                                                                         downconvert lock from EX to NR
                                                                         upconvert lock from NR to EX
      <<< migration may schedule out here, then
      <<< node 2 send down convert request to convert type from EX to
      <<< NR, then send up convert request to convert type from NR to
      <<< EX, at this time, lockres granted list is empty, and two locks
      <<< in the converting list, node x up convert lock followed by
      <<< node 2 up convert lock.
      
      	 // will set lockres RES_MIGRATING flag, the following
      	 // lock/unlock can not run
           dlm_lockres_release_ast(dlm, res);
         }
      
         dlm_send_one_lockres()
                                                                       dlm_process_recovery_data()
                                                                         for (i=0; i<mres->num_locks; i++)
                                                                           if (ml->node == dlm->node_num)
                                                                             for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) {
                                                                              list_for_each_entry(lock, tmpq, list)
                                                                              if (lock) break; <<< lock is invalid as grant list is empty.
                                                                             }
                                                                             if (lock->ml.node != ml->node)
                                                                               BUG() >>> crash here
       }
      
      I see the above locks status from a vmcore of our internal bug.
      Signed-off-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: default avatarWengang Wang <wen.gang.wang@oracle.com>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Reviewed-by: default avatarSrinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ca2ba53d
    • Matt Fleming's avatar
      sh: fix format string bug in stack tracer · 75c623ae
      Matt Fleming authored
      commit a0c32761 upstream.
      
      Kees reported the following error:
      
         arch/sh/kernel/dumpstack.c: In function 'print_trace_address':
         arch/sh/kernel/dumpstack.c:118:2: error: format not a string literal and no format arguments [-Werror=format-security]
      
      Use the "%s" format so that it's impossible to interpret 'data' as a
      format string.
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      Reported-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      75c623ae
    • Alex Deucher's avatar
      drm/radeon: call drm_edid_to_eld when we update the edid · f81305b3
      Alex Deucher authored
      commit 16086279 upstream.
      
      This needs to be done to update some of the fields in
      the connector structure used by the audio code.
      
      Noticed by several users on irc.
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      f81305b3
    • Christopher Friedt's avatar
      drm/vmwgfx: correct fb_fix_screeninfo.line_length · 8dd0cd27
      Christopher Friedt authored
      commit aa6de142 upstream.
      
      Previously, the vmwgfx_fb driver would allow users to call FBIOSET_VINFO, but it would not adjust
      the FINFO properly, resulting in distorted screen rendering. The patch corrects that behaviour.
      
      See https://bugs.gentoo.org/show_bug.cgi?id=494794 for examples.
      Signed-off-by: default avatarChristopher Friedt <chrisfriedt@gmail.com>
      Reviewed-by: default avatarThomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      8dd0cd27
    • Jeff Mahoney's avatar
      reiserfs: fix race in readdir · 078a6830
      Jeff Mahoney authored
      commit 01d88857 upstream.
      
      jdm-20004 reiserfs_delete_xattrs: Couldn't delete all xattrs (-2)
      
      The -ENOENT is due to readdir calling dir_emit on the same entry twice.
      
      If the dir_emit callback sleeps and the tree is changed underneath us,
      we won't be able to trust deh_offset(deh) anymore. We need to save
      next_pos before we might sleep so we can find the next entry.
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      078a6830
    • Yann Droneaud's avatar
      IB/ehca: Returns an error on ib_copy_to_udata() failure · 47b1cf90
      Yann Droneaud authored
      commit 5bdb0f02 upstream.
      
      In case of error when writing to userspace, function ehca_create_cq()
      does not set an error code before following its error path.
      
      This patch sets the error code to -EFAULT when ib_copy_to_udata()
      fails.
      
      This was caught when using spatch (aka. coccinelle)
      to rewrite call to ib_copy_{from,to}_udata().
      
      Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci
      Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.comSigned-off-by: default avatarYann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      47b1cf90
    • Yann Droneaud's avatar
      IB/mthca: Return an error on ib_copy_to_udata() failure · 44a52401
      Yann Droneaud authored
      commit 08e74c4b upstream.
      
      In case of error when writing to userspace, the function mthca_create_cq()
      does not set an error code before following its error path.
      
      This patch sets the error code to -EFAULT when ib_copy_to_udata() fails.
      
      This was caught when using spatch (aka. coccinelle)
      to rewrite call to ib_copy_{from,to}_udata().
      
      Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci
      Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.comSigned-off-by: default avatarYann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      44a52401
    • W. Trevor King's avatar
      ALSA: hda - Enable beep for ASUS 1015E · 38c4a015
      W. Trevor King authored
      commit a4b7f21d upstream.
      
      The `lspci -nnvv` output contains (wrapped for line length):
      
        00:1b.0 Audio device [0403]:
          Intel Corporation 7 Series/C210 Series Chipset Family
          High Definition Audio Controller [8086:1e20] (rev 04)
              Subsystem: ASUSTeK Computer Inc. Device [1043:115d]
      Signed-off-by: default avatarW. Trevor King <wking@tremily.us>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      38c4a015
    • Huacai Chen's avatar
      MIPS: Hibernate: Flush TLB entries in swsusp_arch_resume() · 258f9621
      Huacai Chen authored
      commit c14af233 upstream.
      
      The original MIPS hibernate code flushes cache and TLB entries in
      swsusp_arch_resume(). But they are removed in Commit 44eeab67
      (MIPS: Hibernation: Remove SMP TLB and cacheflushing code.). A cross-
      CPU flush is surely unnecessary because all but the local CPU have
      already been disabled. But a local flush (at least the TLB flush) is
      needed. When we do hibernation on Loongson-3 with an E1000E NIC, it is
      very easy to produce a kernel panic (kernel page fault, or unaligned
      access). The root cause is E1000E driver use vzalloc_node() to allocate
      pages, the stale TLB entries of the booting kernel will be misused by
      the resumed target kernel.
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Cc: John Crispin <john@phrozen.org>
      Cc: Steven J. Hill <Steven.Hill@imgtec.com>
      Cc: Aurelien Jarno <aurelien@aurel32.net>
      Cc: linux-mips@linux-mips.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Patchwork: https://patchwork.linux-mips.org/patch/6643/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      258f9621
    • J. Bruce Fields's avatar
      nfsd4: fix setclientid encode size · 2791a9e4
      J. Bruce Fields authored
      commit 480efaee upstream.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2791a9e4
    • Mike Snitzer's avatar
      dm thin: fix dangling bio in process_deferred_bios error path · 28ba5aae
      Mike Snitzer authored
      commit fe76cd88 upstream.
      
      If unable to ensure_next_mapping() we must add the current bio, which
      was removed from the @bios list via bio_list_pop, back to the
      deferred_bios list before all the remaining @bios.
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Acked-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      28ba5aae
    • Jani Nikula's avatar
      drm/i915/tv: fix gen4 composite s-video tv-out · e6f13570
      Jani Nikula authored
      commit e1f23f3d upstream.
      
      This is *not* bisected, but the likely regression is
      
      commit c3561438
      Author: Zhao Yakui <yakui.zhao@intel.com>
      Date:   Tue Nov 24 09:48:48 2009 +0800
      
          drm/i915: Don't set up the TV port if it isn't in the BIOS table.
      
      The commit does not check for all TV device types that might be present
      in the VBT, disabling TV out for the missing ones. Add composite
      S-video.
      Reported-and-tested-by: default avatarMatthew Khouzam <matthew.khouzam@gmail.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73362Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      [bwh: Backported to 3.2: s/old\.device_type/device_type/]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e6f13570
    • J. Bruce Fields's avatar
      nfsd: notify_change needs elevated write count · 8389a2e9
      J. Bruce Fields authored
      commit 9f67f189 upstream.
      
      Looks like this bug has been here since these write counts were
      introduced, not sure why it was just noticed now.
      
      Thanks also to Jan Kara for pointing out the problem.
      Reported-by: default avatarMatthew Rahtz <mrahtz@rapitasystems.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      8389a2e9
    • Ben Hutchings's avatar
      nfsd: Add fh_{want,drop}_write() · 3863e89c
      Ben Hutchings authored
      Part of commit bad0dcff
      ('new helpers: fh_{want,drop}_write()') upstream.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3863e89c
    • J. Bruce Fields's avatar
      4b739882
    • J. Bruce Fields's avatar
      nfsd4: buffer-length check for SUPPATTR_EXCLCREAT · c674ae09
      J. Bruce Fields authored
      commit de3997a7 upstream.
      
      This was an omission from 8c18f205
      "nfsd41: SUPPATTR_EXCLCREAT attribute".
      
      Cc: Benny Halevy <bhalevy@primarydata.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c674ae09
    • Jason Wang's avatar
      x86, hyperv: Bypass the timer_irq_works() check · 5bb8af45
      Jason Wang authored
      commit ca3ba2a2 upstream.
      
      This patch bypass the timer_irq_works() check for hyperv guest since:
      
      - It was guaranteed to work.
      - timer_irq_works() may fail sometime due to the lpj calibration were inaccurate
        in a hyperv guest or a buggy host.
      
      In the future, we should get the tsc frequency from hypervisor and use preset
      lpj instead.
      
      [ hpa: I would prefer to not defer things to "the future" in the future... ]
      
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Acked-by: default avatarK. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Link: http://lkml.kernel.org/r/1393558229-14755-1-git-send-email-jasowang@redhat.comSigned-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5bb8af45
    • Marek Vasut's avatar
      gpio: mxs: Allow for recursive enable_irq_wake() call · 4f86b545
      Marek Vasut authored
      commit a585f87c upstream.
      
      The scenario here is that someone calls enable_irq_wake() from somewhere
      in the code. This will result in the lockdep producing a backtrace as can
      be seen below. In my case, this problem is triggered when using the wl1271
      (TI WlCore) driver found in drivers/net/wireless/ti/ .
      
      The problem cause is rather obvious from the backtrace, but let's outline
      the dependency. enable_irq_wake() grabs the IRQ buslock in irq_set_irq_wake(),
      which in turns calls mxs_gpio_set_wake_irq() . But mxs_gpio_set_wake_irq()
      calls enable_irq_wake() again on the one-level-higher IRQ , thus it tries to
      grab the IRQ buslock again in irq_set_irq_wake() . Because the spinlock in
      irq_set_irq_wake()->irq_get_desc_buslock()->__irq_get_desc_lock() is not
      marked as recursive, lockdep will spew the stuff below.
      
      We know we can safely re-enter the lock, so use IRQ_GC_INIT_NESTED_LOCK to
      fix the spew.
      
       =============================================
       [ INFO: possible recursive locking detected ]
       3.10.33-00012-gf06b763-dirty #61 Not tainted
       ---------------------------------------------
       kworker/0:1/18 is trying to acquire lock:
        (&irq_desc_lock_class){-.-...}, at: [<c00685f0>] __irq_get_desc_lock+0x48/0x88
      
       but task is already holding lock:
        (&irq_desc_lock_class){-.-...}, at: [<c00685f0>] __irq_get_desc_lock+0x48/0x88
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&irq_desc_lock_class);
         lock(&irq_desc_lock_class);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       3 locks held by kworker/0:1/18:
        #0:  (events){.+.+.+}, at: [<c0036308>] process_one_work+0x134/0x4a4
        #1:  ((&fw_work->work)){+.+.+.}, at: [<c0036308>] process_one_work+0x134/0x4a4
        #2:  (&irq_desc_lock_class){-.-...}, at: [<c00685f0>] __irq_get_desc_lock+0x48/0x88
      
       stack backtrace:
       CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 3.10.33-00012-gf06b763-dirty #61
       Workqueue: events request_firmware_work_func
       [<c0013eb4>] (unwind_backtrace+0x0/0xf0) from [<c0011c74>] (show_stack+0x10/0x14)
       [<c0011c74>] (show_stack+0x10/0x14) from [<c005bb08>] (__lock_acquire+0x140c/0x1a64)
       [<c005bb08>] (__lock_acquire+0x140c/0x1a64) from [<c005c6a8>] (lock_acquire+0x9c/0x104)
       [<c005c6a8>] (lock_acquire+0x9c/0x104) from [<c051d5a4>] (_raw_spin_lock_irqsave+0x44/0x58)
       [<c051d5a4>] (_raw_spin_lock_irqsave+0x44/0x58) from [<c00685f0>] (__irq_get_desc_lock+0x48/0x88)
       [<c00685f0>] (__irq_get_desc_lock+0x48/0x88) from [<c0068e78>] (irq_set_irq_wake+0x20/0xf4)
       [<c0068e78>] (irq_set_irq_wake+0x20/0xf4) from [<c027260c>] (mxs_gpio_set_wake_irq+0x1c/0x24)
       [<c027260c>] (mxs_gpio_set_wake_irq+0x1c/0x24) from [<c0068cf4>] (set_irq_wake_real+0x30/0x44)
       [<c0068cf4>] (set_irq_wake_real+0x30/0x44) from [<c0068ee4>] (irq_set_irq_wake+0x8c/0xf4)
       [<c0068ee4>] (irq_set_irq_wake+0x8c/0xf4) from [<c0310748>] (wlcore_nvs_cb+0x10c/0x97c)
       [<c0310748>] (wlcore_nvs_cb+0x10c/0x97c) from [<c02be5e8>] (request_firmware_work_func+0x38/0x58)
       [<c02be5e8>] (request_firmware_work_func+0x38/0x58) from [<c0036394>] (process_one_work+0x1c0/0x4a4)
       [<c0036394>] (process_one_work+0x1c0/0x4a4) from [<c0036a4c>] (worker_thread+0x138/0x394)
       [<c0036a4c>] (worker_thread+0x138/0x394) from [<c003cb74>] (kthread+0xa4/0xb0)
       [<c003cb74>] (kthread+0xa4/0xb0) from [<c000ee00>] (ret_from_fork+0x14/0x34)
       wlcore: loaded
      Signed-off-by: default avatarMarek Vasut <marex@denx.de>
      Acked-by: default avatarShawn Guo <shawn.guo@linaro.org>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4f86b545
    • Josef Bacik's avatar
      Btrfs: fix deadlock with nested trans handles · d57813e9
      Josef Bacik authored
      commit 3bbb24b2 upstream.
      
      Zach found this deadlock that would happen like this
      
      btrfs_end_transaction <- reduce trans->use_count to 0
        btrfs_run_delayed_refs
          btrfs_cow_block
            find_free_extent
      	btrfs_start_transaction <- increase trans->use_count to 1
                allocate chunk
      	btrfs_end_transaction <- decrease trans->use_count to 0
      	  btrfs_run_delayed_refs
      	    lock tree block we are cowing above ^^
      
      We need to only decrease trans->use_count if it is above 1, otherwise leave it
      alone.  This will make nested trans be the only ones who decrease their added
      ref, and will let us get rid of the trans->use_count++ hack if we have to commit
      the transaction.  Thanks,
      Reported-by: default avatarZach Brown <zab@redhat.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Tested-by: default avatarZach Brown <zab@redhat.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d57813e9
    • Richard Guy Briggs's avatar
      audit: convert PPIDs to the inital PID namespace. · 135db4db
      Richard Guy Briggs authored
      commit c92cdeb4 upstream.
      
      sys_getppid() returns the parent pid of the current process in its own pid
      namespace.  Since audit filters are based in the init pid namespace, a process
      could avoid a filter or trigger an unintended one by being in an alternate pid
      namespace or log meaningless information.
      
      Switch to task_ppid_nr() for PPIDs to anchor all audit filters in the
      init_pid_ns.
      
      (informed by ebiederman's 6c621b7e)
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarRichard Guy Briggs <rgb@redhat.com>
      [bwh: Backported to 3.2: sys_getppid() is used by audit_exit() but not
       audit_log_task_info()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      135db4db
    • Richard Guy Briggs's avatar
      pid: get pid_t ppid of task in init_pid_ns · 0ce844ae
      Richard Guy Briggs authored
      commit ad36d282 upstream.
      
      Added the functions task_ppid_nr_ns() and task_ppid_nr() to abstract the lookup
      of the PPID (real_parent's pid_t) of a process, including rcu locking, in the
      arbitrary and init_pid_ns.
      This provides an alternative to sys_getppid(), which is relative to the child
      process' pid namespace.
      
      (informed by ebiederman's 6c621b7e)
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarRichard Guy Briggs <rgb@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0ce844ae
    • Krzysztof Kozlowski's avatar
      mfd: 88pm860x: Fix possible NULL pointer dereference on i2c_new_dummy error · 209e1b81
      Krzysztof Kozlowski authored
      commit 159ce52a upstream.
      
      During probe the driver allocates dummy I2C device for companion chip
      with i2c_new_dummy() but it does not check the return value of this call.
      
      In case of error (i2c_new_device(): memory allocation failure or I2C
      address cannot be used) this function returns NULL which is later used
      by regmap_init_i2c().
      
      If i2c_new_dummy() fails for companion device, fail also the probe for
      main MFD driver.
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      [bwh: Backported to 3.2:
       - Adjust filename, context
       - Add kfree() before return, as driver is not using managed allocations]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      209e1b81
    • Krzysztof Kozlowski's avatar
      mfd: max8925: Fix possible NULL pointer dereference on i2c_new_dummy error · 758e64a5
      Krzysztof Kozlowski authored
      commit 96cf3ded upstream.
      
      During probe the driver allocates dummy I2C devices for RTC and ADC
      with i2c_new_dummy() but it does not check the return value of this
      calls.
      
      In case of error (i2c_new_device(): memory allocation failure or I2C
      address cannot be used) this function returns NULL which is later used
      by i2c_unregister_device().
      
      If i2c_new_dummy() fails for RTC or ADC devices, fail also the probe
      for main MFD driver.
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      758e64a5
    • Krzysztof Kozlowski's avatar
      mfd: max8998: Fix possible NULL pointer dereference on i2c_new_dummy error · c330c022
      Krzysztof Kozlowski authored
      commit ed26f87b upstream.
      
      During probe the driver allocates dummy I2C device for RTC with i2c_new_dummy() but it does not check the return value of this call.
      
      In case of error (i2c_new_device(): memory allocation failure or I2C
      address cannot be used) this function returns NULL which is later used
      by i2c_unregister_device().
      
      If i2c_new_dummy() fails for RTC device, fail also the probe for
      main MFD driver.
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      c330c022
    • Krzysztof Kozlowski's avatar
      mfd: max8997: Fix possible NULL pointer dereference on i2c_new_dummy error · 792aa59f
      Krzysztof Kozlowski authored
      commit 97dc4ed3 upstream.
      
      During probe the driver allocates dummy I2C devices for RTC, haptic and
      MUIC with i2c_new_dummy() but it does not check the return value of this
      calls.
      
      In case of error (i2c_new_device(): memory allocation failure or I2C
      address cannot be used) this function returns NULL which is later used
      by i2c_unregister_device().
      
      If i2c_new_dummy() fails for RTC, haptic or MUIC devices, fail also the
      probe for main MFD driver.
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      792aa59f
    • Linus Walleij's avatar
      mfd: Include all drivers in subsystem menu · fb2da9df
      Linus Walleij authored
      commit a6e6e660 upstream.
      
      It is currently not possible to select the SA1100 or Vexpress
      drivers in the MFD subsystem, because the menu for the entire
      subsystem ends before these options are presented.
      
      Move the main menu closing and the endif for HAS_IOMEM to the
      end of the file so these are selectable again.
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarLee Jones <lee.jones@linaro.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      fb2da9df
    • Yann Droneaud's avatar
      IB/nes: Return an error on ib_copy_from_udata() failure instead of NULL · a3ffdf42
      Yann Droneaud authored
      commit 9d194d10 upstream.
      
      In case of error while accessing to userspace memory, function
      nes_create_qp() returns NULL instead of an error code wrapped through
      ERR_PTR().  But NULL is not expected by ib_uverbs_create_qp(), as it
      check for error with IS_ERR().
      
      As page 0 is likely not mapped, it is going to trigger an Oops when
      the kernel will try to dereference NULL pointer to access to struct
      ib_qp's fields.
      
      In some rare cases, page 0 could be mapped by userspace, which could
      turn this bug to a vulnerability that could be exploited: the function
      pointers in struct ib_device will be under userspace total control.
      
      This was caught when using spatch (aka. coccinelle)
      to rewrite calls to ib_copy_{from,to}_udata().
      
      Link: https://www.gitorious.org/opteya/ib-hw-nes-create-qp-null
      Link: https://www.gitorious.org/opteya/coccib/source/75ebf2c1033c64c1d81df13e4ae44ee99c989eba:ib_copy_udata.cocci
      Link: http://marc.info/?i=cover.1394485254.git.ydroneaud@opteya.comSigned-off-by: default avatarYann Droneaud <ydroneaud@opteya.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a3ffdf42
    • Dennis Dalessandro's avatar
      IB/ipath: Fix potential buffer overrun in sending diag packet routine · a9782df2
      Dennis Dalessandro authored
      commit a2cb0eb8 upstream.
      
      Guard against a potential buffer overrun.  The size to read from the
      user is passed in, and due to the padding that needs to be taken into
      account, as well as the place holder for the ICRC it is possible to
      overflow the 32bit value which would cause more data to be copied from
      user space than is allocated in the buffer.
      Reported-by: default avatarNico Golde <nico@ngolde.de>
      Reported-by: default avatarFabian Yamaguchi <fabs@goesec.de>
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a9782df2
    • Felix Fietkau's avatar
      ath9k: fix ready time of the multicast buffer queue · a7bbda74
      Felix Fietkau authored
      commit 3b3e0efb upstream.
      
      qi->tqi_readyTime is written directly to registers that expect
      microseconds as unit instead of TU.
      When setting the CABQ ready time, cur_conf->beacon_interval is in TU, so
      convert it to microseconds before passing it to ath9k_hw.
      
      This should hopefully fix some Tx DMA issues with buffered multicast
      frames in AP mode.
      Signed-off-by: default avatarFelix Fietkau <nbd@openwrt.org>
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a7bbda74
    • Eric Whitney's avatar
      ext4: fix partial cluster handling for bigalloc file systems · e0a00412
      Eric Whitney authored
      commit c0634493 upstream.
      
      Commit 9cb00419, which enables hole punching for bigalloc file
      systems, exposed a bug introduced by commit 6ae06ff5 in an earlier
      release.  When run on a bigalloc file system, xfstests generic/013, 068,
      075, 083, 091, 100, 112, 127, 263, 269, and 270 fail with e2fsck errors
      or cause kernel error messages indicating that previously freed blocks
      are being freed again.
      
      The latter commit optimizes the selection of the starting extent in
      ext4_ext_rm_leaf() when hole punching by beginning with the extent
      supplied in the path argument rather than with the last extent in the
      leaf node (as is still done when truncating).  However, the code in
      rm_leaf that initially sets partial_cluster to track cluster sharing on
      extent boundaries is only guaranteed to run if rm_leaf starts with the
      last node in the leaf.  Consequently, partial_cluster is not correctly
      initialized when hole punching, and a cluster on the boundary of a
      punched region that should be retained may instead be deallocated.
      Signed-off-by: default avatarEric Whitney <enwlinux@gmail.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e0a00412
    • Rusty Russell's avatar
      virtio_balloon: don't softlockup on huge balloon changes. · d7a04a3b
      Rusty Russell authored
      commit 1f74ef0f upstream.
      
      When adding or removing 100G from a balloon:
      
          BUG: soft lockup - CPU#0 stuck for 22s! [vballoon:367]
      
      We have a wait_event_interruptible(), but the condition is always true
      (more ballooning to do) so we don't ever sleep.  We also have a
      wait_event() for the host to ack, but that is also always true as QEMU
      is synchronous for balloon operations.
      Reported-by: default avatarGopesh Kumar Chaudhary <gopchaud@in.ibm.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d7a04a3b
    • Emmanuel Grumbach's avatar
      iwlwifi: dvm: take mutex when sending SYNC BT config command · 769ddc4b
      Emmanuel Grumbach authored
      commit 82e5a649 upstream.
      
      There is a flow in which we send the host command in SYNC
      mode, but we don't take priv->mutex.
      
      Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1046495Reviewed-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      [bwh: Backported to 3.2:
       - Adjust filename, context
       - mutex is priv->shrd->mutex]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      769ddc4b
    • Ajesh Kunhipurayil Vijayan's avatar
      jffs2: Fix crash due to truncation of csize · 65417cd8
      Ajesh Kunhipurayil Vijayan authored
      commit 41bf1a24 upstream.
      
      mounting JFFS2 partition sometimes crashes with this call trace:
      
      [ 1322.240000] Kernel bug detected[#1]:
      [ 1322.244000] Cpu 2
      [ 1322.244000] $ 0   : 0000000000000000 0000000000000018 000000003ff00070 0000000000000001
      [ 1322.252000] $ 4   : 0000000000000000 c0000000f3980150 0000000000000000 0000000000010000
      [ 1322.260000] $ 8   : ffffffffc09cd5f8 0000000000000001 0000000000000088 c0000000ed300de8
      [ 1322.268000] $12   : e5e19d9c5f613a45 ffffffffc046d464 0000000000000000 66227ba5ea67b74e
      [ 1322.276000] $16   : c0000000f1769c00 c0000000ed1e0200 c0000000f3980150 0000000000000000
      [ 1322.284000] $20   : c0000000f3a80000 00000000fffffffc c0000000ed2cfbd8 c0000000f39818f0
      [ 1322.292000] $24   : 0000000000000004 0000000000000000
      [ 1322.300000] $28   : c0000000ed2c0000 c0000000ed2cfab8 0000000000010000 ffffffffc039c0b0
      [ 1322.308000] Hi    : 000000000000023c
      [ 1322.312000] Lo    : 000000000003f802
      [ 1322.316000] epc   : ffffffffc039a9f8 check_tn_node+0x88/0x3b0
      [ 1322.320000]     Not tainted
      [ 1322.324000] ra    : ffffffffc039c0b0 jffs2_do_read_inode_internal+0x1250/0x1e48
      [ 1322.332000] Status: 5400f8e3    KX SX UX KERNEL EXL IE
      [ 1322.336000] Cause : 00800034
      [ 1322.340000] PrId  : 000c1004 (Netlogic XLP)
      [ 1322.344000] Modules linked in:
      [ 1322.348000] Process jffs2_gcd_mtd7 (pid: 264, threadinfo=c0000000ed2c0000, task=c0000000f0e68dd8, tls=0000000000000000)
      [ 1322.356000] Stack : c0000000f1769e30 c0000000ed010780 c0000000ed010780 c0000000ed300000
              c0000000f1769c00 c0000000f3980150 c0000000f3a80000 00000000fffffffc
              c0000000ed2cfbd8 ffffffffc039c0b0 ffffffffc09c6340 0000000000001000
              0000000000000dec ffffffffc016c9d8 c0000000f39805a0 c0000000f3980180
              0000008600000000 0000000000000000 0000000000000000 0000000000000000
              0001000000000dec c0000000f1769d98 c0000000ed2cfb18 0000000000010000
              0000000000010000 0000000000000044 c0000000f3a80000 c0000000f1769c00
              c0000000f3d207a8 c0000000f1769d98 c0000000f1769de0 ffffffffc076f9c0
              0000000000000009 0000000000000000 0000000000000000 ffffffffc039cf90
              0000000000000017 ffffffffc013fbdc 0000000000000001 000000010003e61c
              ...
      [ 1322.424000] Call Trace:
      [ 1322.428000] [<ffffffffc039a9f8>] check_tn_node+0x88/0x3b0
      [ 1322.432000] [<ffffffffc039c0b0>] jffs2_do_read_inode_internal+0x1250/0x1e48
      [ 1322.440000] [<ffffffffc039cf90>] jffs2_do_crccheck_inode+0x70/0xd0
      [ 1322.448000] [<ffffffffc03a1b80>] jffs2_garbage_collect_pass+0x160/0x870
      [ 1322.452000] [<ffffffffc03a392c>] jffs2_garbage_collect_thread+0xdc/0x1f0
      [ 1322.460000] [<ffffffffc01541c8>] kthread+0xb8/0xc0
      [ 1322.464000] [<ffffffffc0106d18>] kernel_thread_helper+0x10/0x18
      [ 1322.472000]
      [ 1322.472000]
      Code: 67bd0050  94a4002c  2c830001 <00038036> de050218  2403fffc  0080a82d  00431824  24630044
      [ 1322.480000] ---[ end trace b052bb90e97dfbf5 ]---
      
      The variable csize in structure jffs2_tmp_dnode_info is of type uint16_t, but it
      is used to hold the compressed data length(csize) which is declared as uint32_t.
      So, when the value of csize exceeds 16bits, it gets truncated when assigned to
      tn->csize. This is causing a kernel BUG.
      Changing the definition of csize in jffs2_tmp_dnode_info to uint32_t fixes the issue.
      Signed-off-by: default avatarAjesh Kunhipurayil Vijayan <ajesh@broadcom.com>
      Signed-off-by: default avatarKamlakant Patel <kamlakant.patel@broadcom.com>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      65417cd8
    • Kamlakant Patel's avatar
      jffs2: Fix segmentation fault found in stress test · 9634d073
      Kamlakant Patel authored
      commit 3367da56 upstream.
      
      Creating a large file on a JFFS2 partition sometimes crashes with this call
      trace:
      
      [  306.476000] CPU 13 Unable to handle kernel paging request at virtual address c0000000dfff8002, epc == ffffffffc03a80a8, ra == ffffffffc03a8044
      [  306.488000] Oops[#1]:
      [  306.488000] Cpu 13
      [  306.492000] $ 0   : 0000000000000000 0000000000000000 0000000000008008 0000000000008007
      [  306.500000] $ 4   : c0000000dfff8002 000000000000009f c0000000e0007cde c0000000ee95fa58
      [  306.508000] $ 8   : 0000000000000001 0000000000008008 0000000000010000 ffffffffffff8002
      [  306.516000] $12   : 0000000000007fa9 000000000000ff0e 000000000000ff0f 80e55930aebb92bb
      [  306.524000] $16   : c0000000e0000000 c0000000ee95fa5c c0000000efc80000 ffffffffc09edd70
      [  306.532000] $20   : ffffffffc2b60000 c0000000ee95fa58 0000000000000000 c0000000efc80000
      [  306.540000] $24   : 0000000000000000 0000000000000004
      [  306.548000] $28   : c0000000ee950000 c0000000ee95f738 0000000000000000 ffffffffc03a8044
      [  306.556000] Hi    : 00000000000574a5
      [  306.560000] Lo    : 6193b7a7e903d8c9
      [  306.564000] epc   : ffffffffc03a80a8 jffs2_rtime_compress+0x98/0x198
      [  306.568000]     Tainted: G        W
      [  306.572000] ra    : ffffffffc03a8044 jffs2_rtime_compress+0x34/0x198
      [  306.580000] Status: 5000f8e3    KX SX UX KERNEL EXL IE
      [  306.584000] Cause : 00800008
      [  306.588000] BadVA : c0000000dfff8002
      [  306.592000] PrId  : 000c1100 (Netlogic XLP)
      [  306.596000] Modules linked in:
      [  306.596000] Process dd (pid: 170, threadinfo=c0000000ee950000, task=c0000000ee6e0858, tls=0000000000c47490)
      [  306.608000] Stack : 7c547f377ddc7ee4 7ffc7f967f5d7fae 7f617f507fc37ff4 7e7d7f817f487f5f
              7d8e7fec7ee87eb3 7e977ff27eec7f9e 7d677ec67f917f67 7f3d7e457f017ed7
              7fd37f517f867eb2 7fed7fd17ca57e1d 7e5f7fe87f257f77 7fd77f0d7ede7fdb
              7fba7fef7e197f99 7fde7fe07ee37eb5 7f5c7f8c7fc67f65 7f457fb87f847e93
              7f737f3e7d137cd9 7f8e7e9c7fc47d25 7dbb7fac7fb67e52 7ff17f627da97f64
              7f6b7df77ffa7ec5 80057ef17f357fb3 7f767fa27dfc7fd5 7fe37e8e7fd07e53
              7e227fcf7efb7fa1 7f547e787fa87fcc 7fcb7fc57f5a7ffb 7fc07f6c7ea97e80
              7e2d7ed17e587ee0 7fb17f9d7feb7f31 7f607e797e887faa 7f757fdd7c607ff3
              7e877e657ef37fbd 7ec17fd67fe67ff7 7ff67f797ff87dc4 7eef7f3a7c337fa6
              7fe57fc97ed87f4b 7ebe7f097f0b8003 7fe97e2a7d997cba 7f587f987f3c7fa9
              ...
      [  306.676000] Call Trace:
      [  306.680000] [<ffffffffc03a80a8>] jffs2_rtime_compress+0x98/0x198
      [  306.684000] [<ffffffffc0394f10>] jffs2_selected_compress+0x110/0x230
      [  306.692000] [<ffffffffc039508c>] jffs2_compress+0x5c/0x388
      [  306.696000] [<ffffffffc039dc58>] jffs2_write_inode_range+0xd8/0x388
      [  306.704000] [<ffffffffc03971bc>] jffs2_write_end+0x16c/0x2d0
      [  306.708000] [<ffffffffc01d3d90>] generic_file_buffered_write+0xf8/0x2b8
      [  306.716000] [<ffffffffc01d4e7c>] __generic_file_aio_write+0x1ac/0x350
      [  306.720000] [<ffffffffc01d50a0>] generic_file_aio_write+0x80/0x168
      [  306.728000] [<ffffffffc021f7dc>] do_sync_write+0x94/0xf8
      [  306.732000] [<ffffffffc021ff6c>] vfs_write+0xa4/0x1a0
      [  306.736000] [<ffffffffc02202e8>] SyS_write+0x50/0x90
      [  306.744000] [<ffffffffc0116cc0>] handle_sys+0x180/0x1a0
      [  306.748000]
      [  306.748000]
      Code: 020b202d  0205282d  90a50000 <90840000> 14a40038  00000000  0060602d  0000282d  016c5823
      [  306.760000] ---[ end trace 79dd088435be02d0 ]---
      Segmentation fault
      
      This crash is caused because the 'positions' is declared as an array of signed
      short. The value of position is in the range 0..65535, and will be converted
      to a negative number when the position is greater than 32767 and causes a
      corruption and crash. Changing the definition to 'unsigned short' fixes this
      issue
      Signed-off-by: default avatarJayachandran C <jchandra@broadcom.com>
      Signed-off-by: default avatarKamlakant Patel <kamlakant.patel@broadcom.com>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9634d073
    • Li Zefan's avatar
      jffs2: avoid soft-lockup in jffs2_reserve_space_gc() · 62a9eee3
      Li Zefan authored
      commit 13b546d9 upstream.
      
      We triggered soft-lockup under stress test on 2.6.34 kernel.
      
      BUG: soft lockup - CPU#1 stuck for 60009ms! [lockf2.test:14488]
      ...
      [<bf09a4d4>] (jffs2_do_reserve_space+0x420/0x440 [jffs2])
      [<bf09a528>] (jffs2_reserve_space_gc+0x34/0x78 [jffs2])
      [<bf0a1350>] (jffs2_garbage_collect_dnode.isra.3+0x264/0x478 [jffs2])
      [<bf0a2078>] (jffs2_garbage_collect_pass+0x9c0/0xe4c [jffs2])
      [<bf09a670>] (jffs2_reserve_space+0x104/0x2a8 [jffs2])
      [<bf09dc48>] (jffs2_write_inode_range+0x5c/0x4d4 [jffs2])
      [<bf097d8c>] (jffs2_write_end+0x198/0x2c0 [jffs2])
      [<c00e00a4>] (generic_file_buffered_write+0x158/0x200)
      [<c00e14f4>] (__generic_file_aio_write+0x3a4/0x414)
      [<c00e15c0>] (generic_file_aio_write+0x5c/0xbc)
      [<c012334c>] (do_sync_write+0x98/0xd4)
      [<c0123a84>] (vfs_write+0xa8/0x150)
      [<c0123d74>] (sys_write+0x3c/0xc0)]
      
      Fix this by adding a cond_resched() in the while loop.
      
      [akpm@linux-foundation.org: don't initialize `ret']
      Signed-off-by: default avatarLi Zefan <lizefan@huawei.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      62a9eee3