1. 06 Jun, 2013 5 commits
  2. 03 Jun, 2013 11 commits
  3. 27 May, 2013 1 commit
  4. 26 May, 2013 9 commits
    • Linus Torvalds's avatar
      Linux 3.10-rc3 · e4aa937e
      Linus Torvalds authored
      e4aa937e
    • Manfred Spraul's avatar
      ipc/sem.c: Fix missing wakeups in do_smart_update_queue() · ab465df9
      Manfred Spraul authored
      do_smart_update_queue() is called when an operation (semop,
      semctl(SETVAL), semctl(SETALL), ...) modified the array.  It must check
      which of the sleeping tasks can proceed.
      
      do_smart_update_queue() missed a few wakeups:
       - if a sleeping complex op was completed, then all per-semaphore queues
         must be scanned - not only those that were modified by *sops
       - if a sleeping simple op proceeded, then the global queue must be
         scanned again
      
      And:
       - the test for "|sops == NULL) before scanning the global queue is not
         required: If the global queue is empty, then it doesn't need to be
         scanned - regardless of the reason for calling do_smart_update_queue()
      
      The patch is not optimized, i.e.  even completing a wait-for-zero
      operation causes a rescan.  This is done to keep the patch as simple as
      possible.
      Signed-off-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Acked-by: default avatarDavidlohr Bueso <davidlohr.bueso@hp.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ab465df9
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-3.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 89ff7783
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
      
       - Stable fix to prevent an rpc_task wakeup race
       - Fix a NFSv4.1 session drain deadlock
       - Fix a NFSv4/v4.1 mount regression when not running rpc.gssd
       - Ensure auth_gss pipe detection works in namespaces
       - Fix SETCLIENTID fallback if rpcsec_gss is not available
      
      * tag 'nfs-for-3.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFS: Fix SETCLIENTID fallback if GSS is not available
        SUNRPC: Prevent an rpc_task wakeup race
        NFSv4.1 Fix a pNFS session draining deadlock
        SUNRPC: Convert auth_gss pipe detection to work in namespaces
        SUNRPC: Faster detection if gssd is actually running
        SUNRPC: Fix a bug in gss_create_upcall
      89ff7783
    • Linus Torvalds's avatar
      Merge tag 'edac_fixes_for_3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · 932ff06b
      Linus Torvalds authored
      Pull amd64 edac fix from Borislav Petkov:
       "A sysfs file permissions correction"
      
      * tag 'edac_fixes_for_3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
        amd64_edac: Fix bogus sysfs file permissions
      932ff06b
    • Linus Torvalds's avatar
      Merge branch 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 95f4838e
      Linus Torvalds authored
      Pull parisc fixes from Helge Deller:
       "This time we made the kernel- and interruption stack allocation
        reentrant which fixed some strange kernel crashes (specifically
        protection ID traps).
      
        Furthemore this patchset fixes the interrupt stack in UP and SMP
        configurations by using native locking instructions.  And finally
        usage of floating point calculations on parisc were disabled in the
        MPILIB."
      
      * 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: fix irq stack on UP and SMP
        parisc/superio: Use module_pci_driver to register driver
        parisc: make interrupt and interruption stack allocation reentrant
        parisc: show number of FPE and unaligned access handler calls in /proc/interrupts
        parisc: add additional parisc git tree to MAINTAINERS file
        parisc: use PAGE_SHIFT instead of hardcoded value 12 in pacache.S
        parisc: add rp5470 entry to machine database
        MPILIB: disable usage of floating point registers on parisc
      95f4838e
    • Linus Torvalds's avatar
      Merge tag 'for-linus-v3.10-rc3' of git://oss.sgi.com/xfs/xfs · 088d812f
      Linus Torvalds authored
      Pull xfs fixes from Ben Myers:
       "Here are fixes for corruption on 512 byte filesystems, a rounding
        error, a use-after-free, some flags to fix lockdep reports, and
        several fixes related to CRCs.  We have a somewhat larger post -rc1
        queue than usual due to fixes related to the CRC feature we merged for
        3.10:
      
         - Fix for corruption with FSX on 512 byte blocksize filesystems
         - Fix rounding error in xfs_free_file_space
         - Fix use-after-free with extent free intents
         - Add several missing KM_NOFS flags to fix lockdep reports
         - Several fixes for CRC related code"
      
      * tag 'for-linus-v3.10-rc3' of git://oss.sgi.com/xfs/xfs:
        xfs: remote attribute lookups require the value length
        xfs: xfs_attr_shortform_allfit() does not handle attr3 format.
        xfs: xfs_da3_node_read_verify() doesn't handle XFS_ATTR3_LEAF_MAGIC
        xfs: fix missing KM_NOFS tags to keep lockdep happy
        xfs: Don't reference the EFI after it is freed
        xfs: fix rounding in xfs_free_file_space
        xfs: fix sub-page blocksize data integrity writes
      088d812f
    • Linus Torvalds's avatar
      Merge tag 'for-v3.10-fixes' of git://git.infradead.org/battery-2.6 · 72de4c63
      Linus Torvalds authored
      Pull bettery fixes from Anton Vorontsov:
       "Last minute one-liners: wrong kfree usage fix, module alias fixup and
        kconfig adjustments"
      
      * tag 'for-v3.10-fixes' of git://git.infradead.org/battery-2.6:
        pm2301_charger: Fix module alias prefix
        wm831x_backup: Fix wrong kfree call for devdata->backup.name
        bq27x00: Fix I2C dependency in KConfig
        lp8788-charger: Fix kconfig dependency
      72de4c63
    • Linus Torvalds's avatar
      Merge tag 'pm+acpi-3.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 1aad08dc
      Linus Torvalds authored
      Pull power management and ACPI fixes from Rafael Wysocki:
      
       - Additional CPU ID for the intel_pstate driver from Dirk Brandewie.
      
       - More cpufreq fixes related to ARM big.LITTLE support and locking from
         Viresh Kumar.
      
       - VIA C7 cpufreq build fix from Rafał Bilski.
      
       - ACPI power management fix making it possible to use device power
         states regardless of the CONFIG_PM setting from Rafael J Wysocki.
      
       - New ACPI video blacklist item from Bastian Triller.
      
      * tag 'pm+acpi-3.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / video: Add "Asus UL30A" to ACPI video detect blacklist
        cpufreq: arm_big_little_dt: Instantiate as platform_driver
        cpufreq: arm_big_little_dt: Register driver only if DT has valid data
        cpufreq / e_powersaver: Fix linker error when ACPI processor is a module
        cpufreq / intel_pstate: Add additional supported CPU ID
        cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT
        ACPI / PM: Allow device power states to be used for CONFIG_PM unset
      1aad08dc
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma · 27a24cfa
      Linus Torvalds authored
      Pull slave-dma fixes from Vinod Koul:
       "We have two patches from Andy & Rafael fixing the Lynxpoint dma"
      
      * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
        ACPI / LPSS: register clock device for Lynxpoint DMA properly
        dma: acpi-dma: parse CSRT to extract additional resources
      27a24cfa
  5. 25 May, 2013 5 commits
    • Kyle McMartin's avatar
      score: remove redundant kcore_list entries · 6b3f7b5c
      Kyle McMartin authored
      kcore_vmalloc is in fs/proc/kcore.c and kcore_mem is unused across
      the tree. Noticed while grepping the tree for some other kcore stuff.
      
      (score looks pretty unmaintained to me.)
      Signed-off-by: default avatarKyle McMartin <kyle@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6b3f7b5c
    • Linus Torvalds's avatar
      Merge tag 'arc-v3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 462a2b58
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
      
       - Fallouts/wreckage of Cache Flush optimizations / aliasing dcache
         support
      
       - Fix for an interesting bug where piped input to grep was getting
         mysteriously clobbered
      
      * tag 'arc-v3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: lazy dcache flush broke gdb in non-aliasing configs
        ARC: Use enough bits for determining page's cache color
        ARC: Brown paper bag bug in macro for checking cache color
        ARC: copy_(to|from)_user() to honor usermode-access permissions
        ARC: [mm] Prevent stray dcache lines after__sync_icache_dcach()
        ARC: [TB10x] Remove redundant abilis,simple-pinctrl mechanism
      462a2b58
    • Linus Torvalds's avatar
      Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm · 4dd9aa89
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "Just three this time, all really quite small"
      
      * 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
        ARM: 7729/1: vfp: ensure VFP_arch is non-zero when VFP is not supported
        ARM: 7727/1: remove the .vm_mm value from gate_vma
        ARM: 7723/1: crypto: sha1-armv4-large.S: fix SP handling
      4dd9aa89
    • Vineet Gupta's avatar
      ARC: lazy dcache flush broke gdb in non-aliasing configs · 7bb66f6e
      Vineet Gupta authored
      gdbserver inserting a breakpoint ends up calling copy_user_page() for a
      code page. The generic version of which (non-aliasing config) didn't set
      the PG_arch_1 bit hence update_mmu_cache() didn't sync dcache/icache for
      corresponding dynamic loader code page - causing garbade to be executed.
      
      So now aliasing versions of copy_user_highpage()/clear_page() are made
      default. There is no significant overhead since all of special alias
      handling code is compiled out for non-aliasing build
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      7bb66f6e
    • Linus Torvalds's avatar
      Merge branch 'akpm' (incoming from Andrew Morton) · 9cf18482
      Linus Torvalds authored
      Merge fixes from Andrew Morton:
       "A bunch of fixes and one simple fbdev driver which missed the merge
        window because people will still talking about it (to no great
        effect)."
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (30 commits)
        aio: fix kioctx not being freed after cancellation at exit time
        mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas
        drivers/rtc/rtc-max8998.c: check for pdata presence before dereferencing
        ocfs2: goto out_unlock if ocfs2_get_clusters_nocache() failed in ocfs2_fiemap()
        random: fix accounting race condition with lockless irq entropy_count update
        drivers/char/random.c: fix priming of last_data
        mm/memory_hotplug.c: fix printk format warnings
        nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary
        drivers/block/brd.c: fix brd_lookup_page() race
        fbdev: FB_GOLDFISH should depend on HAS_DMA
        drivers/rtc/rtc-pl031.c: pass correct pointer to free_irq()
        auditfilter.c: fix kernel-doc warnings
        aio: fix io_getevents documentation
        revert "selftest: add simple test for soft-dirty bit"
        drivers/leds/leds-ot200.c: fix error caused by shifted mask
        mm/THP: use pmd_populate() to update the pmd with pgtable_t pointer
        linux/kernel.h: fix kernel-doc warning
        mm compaction: fix of improper cache flush in migration code
        rapidio/tsi721: fix bug in MSI interrupt handling
        hfs: avoid crash in hfs_bnode_create
        ...
      9cf18482
  6. 24 May, 2013 9 commits
    • Linus Torvalds's avatar
      Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 00cec111
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "We didn't have any fixes sent up for -rc2, so this is a slightly
        larger batch.  A bit all over the place platform-wise; OMAP, at91,
        marvell, renesas, sunxi, ux500, etc.
      
        I tried to summarize highlights but there isn't a whole lot to point
        out.  Lots of little things fixed all over.  A couple of defconfig
        updates due to new/changing options."
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (44 commits)
        ARM: at91/sama5: fix incorrect PMC pcr div definition
        ARM: at91/dt: fix macb pinctrl_macb_rmii_mii_alt definition
        ARM: at91: at91sam9n12: move external irq declatation to DT
        ARM: shmobile: marzen: Use error values in usb_power_*
        ARM: tegra: defconfig fixes
        ARM: nomadik: fix IRQ assignment for SMC ethernet
        ARM: vt8500: Add missing NULL terminator in dt_compat
        clk: tegra: add ac97 controller clock
        clk: tegra: remove USB from clk init table
        ARM: dts: mvebu: Fix wrong the address reg value for the L2-cache node
        ARM: plat-orion: Fix num_resources and id for ge10 and ge11
        ARM: OMAP2+: hwmod: Remove sysc slave idle and auto idle apis
        SERIAL: OMAP: Remove the slave idle handling from the driver
        ARM: OMAP2+: serial: Remove the un-used slave idle hooks
        ARM: OMAP2+: hwmod-data: UART IP needs software control to manage sidle modes
        ARM: OMAP2+: hwmod: Add a new flag to handle SIDLE in SWSUP only in active
        ARM: OMAP2+: hwmod: Fix sidle programming in _enable_sysc()/_idle_sysc()
        arm: mvebu: fix the 'ranges' property to handle PCIe
        ARM: mvebu: select ARCH_REQUIRE_GPIOLIB for mvebu platform
        ARM: AM33XX: Add missing .clkdm_name to clkdiv32k_ick clock
        ...
      00cec111
    • Benjamin LaHaise's avatar
      aio: fix kioctx not being freed after cancellation at exit time · 03e04f04
      Benjamin LaHaise authored
      The recent changes overhauling fs/aio.c introduced a bug that results in
      the kioctx not being freed when outstanding kiocbs are cancelled at
      exit_aio() time.  Specifically, a kiocb that is cancelled has its
      completion events discarded by batch_complete_aio(), which then fails to
      wake up the process stuck in free_ioctx().  Fix this by modifying the
      wait_event() condition in free_ioctx() appropriately.
      
      This patch was tested with the cancel operation in the thread based code
      posted yesterday.
      
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: default avatarBenjamin LaHaise <bcrl@kvack.org>
      Signed-off-by: default avatarKent Overstreet <koverstreet@google.com>
      Cc: Kent Overstreet <koverstreet@google.com>
      Cc: Josh Boyer <jwboyer@redhat.com>
      Cc: Zach Brown <zab@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      03e04f04
    • Cliff Wickman's avatar
      mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas · a9ff785e
      Cliff Wickman authored
      A panic can be caused by simply cat'ing /proc/<pid>/smaps while an
      application has a VM_PFNMAP range.  It happened in-house when a
      benchmarker was trying to decipher the memory layout of his program.
      
      /proc/<pid>/smaps and similar walks through a user page table should not
      be looking at VM_PFNMAP areas.
      
      Certain tests in walk_page_range() (specifically split_huge_page_pmd())
      assume that all the mapped PFN's are backed with page structures.  And
      this is not usually true for VM_PFNMAP areas.  This can result in panics
      on kernel page faults when attempting to address those page structures.
      
      There are a half dozen callers of walk_page_range() that walk through a
      task's entire page table (as N.  Horiguchi pointed out).  So rather than
      change all of them, this patch changes just walk_page_range() to ignore
      VM_PFNMAP areas.
      
      The logic of hugetlb_vma() is moved back into walk_page_range(), as we
      want to test any vma in the range.
      
      VM_PFNMAP areas are used by:
      - graphics memory manager   gpu/drm/drm_gem.c
      - global reference unit     sgi-gru/grufile.c
      - sgi special memory        char/mspec.c
      - and probably several out-of-tree modules
      
      [akpm@linux-foundation.org: remove now-unused hugetlb_vma() stub]
      Signed-off-by: default avatarCliff Wickman <cpw@sgi.com>
      Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: David Sterba <dsterba@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a9ff785e
    • Tomasz Figa's avatar
      drivers/rtc/rtc-max8998.c: check for pdata presence before dereferencing · 43c523bf
      Tomasz Figa authored
      Currently the driver can crash with a NULL pointer dereference if no
      pdata is provided, despite of successful registration of the MFD part.
      This patch fixes the problem by adding a NULL check before dereferencing
      the pdata pointer.
      Signed-off-by: default avatarTomasz Figa <t.figa@samsung.com>
      Signed-off-by: default avatarKyungmin Park <kyungmin.park@samsung.com>
      Cc: Sachin Kamat <sachin.kamat@linaro.org>
      Reviewed-by: default avatarJingoo Han <jg1.han@samsung.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      43c523bf
    • Joseph Qi's avatar
      ocfs2: goto out_unlock if ocfs2_get_clusters_nocache() failed in ocfs2_fiemap() · b4ca2b4b
      Joseph Qi authored
      Last time we found there is lock/unlock bug in ocfs2_file_aio_write, and
      then we did a thorough search for all lock resources in
      ocfs2_inode_info, including rw, inode and open lockres and found this
      bug.  My kernel version is 3.0.13, and it is also in the lastest version
      3.9.  In ocfs2_fiemap, once ocfs2_get_clusters_nocache failed, it should
      goto out_unlock instead of out, because we need release buffer head, up
      read alloc sem and unlock inode.
      Signed-off-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Reviewed-by: default avatarJie Liu <jeff.liu@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Acked-by: default avatarSunil Mushran <sunil.mushran@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b4ca2b4b
    • Jiri Kosina's avatar
      random: fix accounting race condition with lockless irq entropy_count update · 10b3a32d
      Jiri Kosina authored
      Commit 902c098a ("random: use lockless techniques in the interrupt
      path") turned IRQ path from being spinlock protected into lockless
      cmpxchg-retry update.
      
      That commit removed r->lock serialization between crediting entropy bits
      from IRQ context and accounting when extracting entropy on userspace
      read path, but didn't turn the r->entropy_count reads/updates in
      account() to use cmpxchg as well.
      
      It has been observed, that under certain circumstances this leads to
      read() on /dev/urandom to return 0 (EOF), as r->entropy_count gets
      corrupted and becomes negative, which in turn results in propagating 0
      all the way from account() to the actual read() call.
      
      Convert the accounting code to be the proper lockless counterpart of
      what has been partially done by 902c098a.
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Cc: Greg KH <greg@kroah.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      10b3a32d
    • Jarod Wilson's avatar
      drivers/char/random.c: fix priming of last_data · 1e7e2e05
      Jarod Wilson authored
      Commit ec8f02da ("random: prime last_data value per fips
      requirements") added priming of last_data per fips requirements.
      
      Unfortuantely, it did so in a way that can lead to multiple threads all
      incrementing nbytes, but only one actually doing anything with the extra
      data, which leads to some fun random corruption and panics.
      
      The fix is to simply do everything needed to prime last_data in a single
      shot, so there's no window for multiple cpus to increment nbytes -- in
      fact, we won't even increment or decrement nbytes anymore, we'll just
      extract the needed EXTRACT_SIZE one time per pool and then carry on with
      the normal routine.
      
      All these changes have been tested across multiple hosts and
      architectures where panics were previously encoutered.  The code changes
      are are strictly limited to areas only touched when when booted in fips
      mode.
      
      This change should also go into 3.8-stable, to make the myriads of fips
      users on 3.8.x happy.
      Signed-off-by: default avatarJarod Wilson <jarod@redhat.com>
      Tested-by: default avatarJan Stancek <jstancek@redhat.com>
      Tested-by: default avatarJan Stodola <jstodola@redhat.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1e7e2e05
    • Randy Dunlap's avatar
      mm/memory_hotplug.c: fix printk format warnings · 348f9f05
      Randy Dunlap authored
      Fix printk format warnings in mm/memory_hotplug.c by using "%pa":
      
        mm/memory_hotplug.c: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 2 has type 'resource_size_t' [-Wformat]
        mm/memory_hotplug.c: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'resource_size_t' [-Wformat]
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Reported-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      348f9f05
    • Ryusuke Konishi's avatar
      nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary · 136e8770
      Ryusuke Konishi authored
      nilfs2: fix issue of nilfs_set_page_dirty for page at EOF boundary
      
      DESCRIPTION:
       There are use-cases when NILFS2 file system (formatted with block size
      lesser than 4 KB) can be remounted in RO mode because of encountering of
      "broken bmap" issue.
      
      The issue was reported by Anthony Doggett <Anthony2486@interfaces.org.uk>:
       "The machine I've been trialling nilfs on is running Debian Testing,
        Linux version 3.2.0-4-686-pae (debian-kernel@lists.debian.org) (gcc
        version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.35-2), but I've
        also reproduced it (identically) with Debian Unstable amd64 and Debian
        Experimental (using the 3.8-trunk kernel).  The problematic partitions
        were formatted with "mkfs.nilfs2 -b 1024 -B 8192"."
      
      SYMPTOMS:
      (1) System log contains error messages likewise:
      
          [63102.496756] nilfs_direct_assign: invalid pointer: 0
          [63102.496786] NILFS error (device dm-17): nilfs_bmap_assign: broken bmap (inode number=28)
          [63102.496798]
          [63102.524403] Remounting filesystem read-only
      
      (2) The NILFS2 file system is remounted in RO mode.
      
      REPRODUSING PATH:
      (1) Create volume group with name "unencrypted" by means of vgcreate utility.
      (2) Run script (prepared by Anthony Doggett <Anthony2486@interfaces.org.uk>):
      
      ----------------[BEGIN SCRIPT]--------------------
      
      VG=unencrypted
      lvcreate --size 2G --name ntest $VG
      mkfs.nilfs2 -b 1024 -B 8192 /dev/mapper/$VG-ntest
      mkdir /var/tmp/n
      mkdir /var/tmp/n/ntest
      mount /dev/mapper/$VG-ntest /var/tmp/n/ntest
      mkdir /var/tmp/n/ntest/thedir
      cd /var/tmp/n/ntest/thedir
      sleep 2
      date
      darcs init
      sleep 2
      dmesg|tail -n 5
      date
      darcs whatsnew || true
      date
      sleep 2
      dmesg|tail -n 5
      ----------------[END SCRIPT]--------------------
      
      REPRODUCIBILITY: 100%
      
      INVESTIGATION:
      As it was discovered, the issue takes place during segment
      construction after executing such sequence of user-space operations:
      
        open("_darcs/index", O_RDWR|O_CREAT|O_NOCTTY, 0666) = 7
        fstat(7, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
        ftruncate(7, 60)
      
      The error message "NILFS error (device dm-17): nilfs_bmap_assign: broken
      bmap (inode number=28)" takes place because of trying to get block
      number for third block of the file with logical offset #3072 bytes.  As
      it is possible to see from above output, the file has 60 bytes of the
      whole size.  So, it is enough one block (1 KB in size) allocation for
      the whole file.  Trying to operate with several blocks instead of one
      takes place because of discovering several dirty buffers for this file
      in nilfs_segctor_scan_file() method.
      
      The root cause of this issue is in nilfs_set_page_dirty function which
      is called just before writing to an mmapped page.
      
      When nilfs_page_mkwrite function handles a page at EOF boundary, it
      fills hole blocks only inside EOF through __block_page_mkwrite().
      
      The __block_page_mkwrite() function calls set_page_dirty() after filling
      hole blocks, thus nilfs_set_page_dirty function (=
      a_ops->set_page_dirty) is called.  However, the current implementation
      of nilfs_set_page_dirty() wrongly marks all buffers dirty even for page
      at EOF boundary.
      
      As a result, buffers outside EOF are inconsistently marked dirty and
      queued for write even though they are not mapped with nilfs_get_block
      function.
      
      FIX:
      This modifies nilfs_set_page_dirty() not to mark hole blocks dirty.
      
      Thanks to Vyacheslav Dubeyko for his effort on analysis and proposals
      for this issue.
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Reported-by: default avatarAnthony Doggett <Anthony2486@interfaces.org.uk>
      Reported-by: default avatarVyacheslav Dubeyko <slava@dubeyko.com>
      Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
      Tested-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      136e8770