Commits · fb770ae414d018255afa7a70b14ba1f8620762dd · nexedi / linux

26 Jul, 2016 5 commits

Btrfs: fix read_node_slot to return errors · fb770ae4

Liu Bo authored Jul 05, 2016

We use read_node_slot() to read btree node and it has two cases,
a) slot is out of range, which means 'no such entry'
b) we fail to read the block, due to checksum fails or corrupted
   content or not with uptodate flag.
But we're returning NULL in both cases, this makes it return -ENOENT
in case a) and return -EIO in case b), and this fixes its callers
as well as btrfs_search_forward() 's caller to catch the new errors.

The problem is reported by Peter Becker, and I can manage to
hit the same BUG_ON by mounting my fuzz image.
Reported-by: Peter Becker <floyd.net@gmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

fb770ae4

Btrfs: fix double free of fs root · 876d2cf1

Liu Bo authored Jun 28, 2016

I got this warning while mounting a btrfs image,

[ 3020.509606] ------------[ cut here ]------------
[ 3020.510107] WARNING: CPU: 3 PID: 5581 at lib/idr.c:1051 ida_remove+0xca/0x190
[ 3020.510853] ida_remove called for id=42 which is not allocated.
[ 3020.511466] Modules linked in:
[ 3020.511802] CPU: 3 PID: 5581 Comm: mount Not tainted 4.7.0-rc5+ #274
[ 3020.512438] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
[ 3020.513385]  0000000000000286 0000000021295d86 ffff88006c66b8f0 ffffffff8182ba5a
[ 3020.514153]  0000000000000000 0000000000000009 ffff88006c66b930 ffffffff810e0ed7
[ 3020.514928]  0000041b00000000 ffffffff8289a8c0 ffff88007f437880 0000000000000000
[ 3020.515717] Call Trace:
[ 3020.515965]  [<ffffffff8182ba5a>] dump_stack+0xc9/0x13f
[ 3020.516487]  [<ffffffff810e0ed7>] __warn+0x147/0x160
[ 3020.517005]  [<ffffffff810e0f4f>] warn_slowpath_fmt+0x5f/0x80
[ 3020.517572]  [<ffffffff8182e6ca>] ida_remove+0xca/0x190
[ 3020.518075]  [<ffffffff813a2bcc>] free_anon_bdev+0x2c/0x60
[ 3020.518609]  [<ffffffff81657a9f>] free_fs_root+0x13f/0x160
[ 3020.519138]  [<ffffffff8165c679>] btrfs_get_fs_root+0x379/0x3d0
[ 3020.519710]  [<ffffffff81e6e975>] ? __mutex_unlock_slowpath+0x155/0x2c0
[ 3020.520366]  [<ffffffff816615b1>] open_ctree+0x2e91/0x3200
[ 3020.520965]  [<ffffffff8161ede2>] btrfs_mount+0x1322/0x15b0
[ 3020.521536]  [<ffffffff81e60e74>] ? kmemleak_alloc_percpu+0x44/0x170
[ 3020.522167]  [<ffffffff8115f5e1>] ? lockdep_init_map+0x61/0x210
[ 3020.522780]  [<ffffffff813a4f59>] mount_fs+0x49/0x2c0
[ 3020.523305]  [<ffffffff813d840c>] vfs_kern_mount+0xac/0x1b0
[ 3020.523872]  [<ffffffff8161dee1>] btrfs_mount+0x421/0x15b0
[ 3020.524402]  [<ffffffff81e60e74>] ? kmemleak_alloc_percpu+0x44/0x170
[ 3020.525045]  [<ffffffff8115f5e1>] ? lockdep_init_map+0x61/0x210
[ 3020.525657]  [<ffffffff8115f5e1>] ? lockdep_init_map+0x61/0x210
[ 3020.526289]  [<ffffffff813a4f59>] mount_fs+0x49/0x2c0
[ 3020.526803]  [<ffffffff813d840c>] vfs_kern_mount+0xac/0x1b0
[ 3020.527365]  [<ffffffff813dc27a>] do_mount+0x41a/0x1770
[ 3020.527899]  [<ffffffff812e800d>] ? strndup_user+0x6d/0xc0
[ 3020.528447]  [<ffffffff812e7f68>] ? memdup_user+0x78/0xb0
[ 3020.528987]  [<ffffffff813ddad0>] SyS_mount+0x150/0x160
[ 3020.529493]  [<ffffffff81e72b7c>] entry_SYSCALL_64_fastpath+0x1f/0xbd

It turns out that we free fs root twice, btrfs_init_fs_root() calls
free_anon_bdev(root->anon_dev) and later then btrfs_get_fs_root() cals
free_fs_root which does another free_anon_bdev() and it ends up with the
above warning.

Instead of reset root->anon_dev to 0 after free_anon_bdev(), we can let
btrfs_init_fs_root() return directly since its callers have already done
the free job by calling free_fs_root().
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: David Sterba <dsterba@suse.com>

876d2cf1

Btrfs: error out if generic_bin_search get invalid arguments · 5e24e9af

Liu Bo authored Jun 23, 2016

With btrfs-corrupt-block, one can set btree node/leaf's field, if
we assign a negative value to node/leaf, we can get various hangs,
eg. if extent_root's nritems is -2ULL, then we get stuck in
 btrfs_read_block_groups() because it has a while loop and
btrfs_search_slot() on extent_root will always return the first
 child.

This lets us know what's happening and returns a EINVAL to callers
instead of returning the first item.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

5e24e9af

Btrfs: check inconsistence between chunk and block group · 6fb37b75

Liu Bo authored Jun 22, 2016

With btrfs-corrupt-block, one can drop one chunk item and mounting
will end up with a panic in btrfs_full_stripe_len().

This doesn't not remove the BUG_ON, but instead checks it a bit
earlier when we find the block group item.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

6fb37b75

btrfs: add missing bytes_readonly attribute file in sysfs · c1fd5c30

Wang Xiaoguang authored Jun 21, 2016

Signed-off-by: Wang Xiaoguang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>

c1fd5c30

21 Jul, 2016 1 commit

Btrfs: fix delalloc accounting after copy_from_user faults · 8b8b08cb

Chris Mason authored Jul 19, 2016

Commit 56244ef1 was almost but not quite enough to fix the
reservation math after btrfs_copy_from_user returned partial copies.

Some users are still seeing warnings in btrfs_destroy_inode, and with a
long enough test run I'm able to trigger them as well.

This patch fixes the accounting math again, bringing it much closer to
the way it was before the sectorsize conversion Chandan did.  The
problem is accounting for the offset into the page/sector when we do a
partial copy.  This one just uses the dirty_sectors variable which
should already be updated properly.
Signed-off-by: Chris Mason <clm@fb.com>
cc: stable@vger.kernel.org # v4.6+

8b8b08cb

20 Jul, 2016 2 commits

Btrfs: avoid deadlocks during reservations in btrfs_truncate_block · bac357dc

Josef Bacik authored Jul 20, 2016

The new enospc code makes it possible to deadlock if we don't use
FLUSH_LIMIT during reservations inside a transaction.  This enforces
the correct flush type to avoid both deadlocks and assertions
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>

bac357dc

Merge branch 'kdave-part1-enospc' into for-linus-4.8 · 24502ebb
Chris Mason authored Jul 20, 2016

24502ebb

07 Jul, 2016 18 commits

Btrfs: use FLUSH_LIMIT for relocation in reserve_metadata_bytes · 8ca17f0f

Josef Bacik authored May 27, 2016

We used to allow you to set FLUSH_ALL and then just wouldn't do things like
commit transactions or wait on ordered extents if we noticed you were in a
transaction. However now that all the flushing for FLUSH_ALL is asynchronous
we've lost the ability to tell, and we could end up deadlocking. So instead use
FLUSH_LIMIT in reserve_metadata_bytes in relocation and then return -EAGAIN if
we error out to preserve the previous behavior. I've also added an ASSERT() to
catch anybody else who tries to do this. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

8ca17f0f

Btrfs: fill relocation block rsv after allocation · ac2fabac

Josef Bacik authored May 27, 2016

Since we set the reloc control before we've reserved our space for relocation we
could race with a root being dirtied and not actually have space to do our init
reloc root. So once we've allocated it and set it up go ahead and make our
reservation before setting the relocate control, that way anybody who tries to
do the reloc root init has space to use. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ac2fabac

Btrfs: always use trans->block_rsv for orphans · 40acc3ee

Josef Bacik authored May 27, 2016

This is the case all the time anyway except for relocation which could be doing
a reloc root for a non ref counted root, in which case we'd end up with some
random block rsv rather than the one we have our reservation in. If there isn't
enough space in the block rsv we are trying to steal from we'll BUG() because we
expect there to be space for the orphan to make its reservation. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

40acc3ee

Btrfs: change how we calculate the global block rsv · ae2e4728

Josef Bacik authored May 27, 2016

Traditionally we've calculated the global block rsv by guessing how much of the
metadata used amount was the extent tree, and then taking the data size and
figuring out how large the csum tree would have to be to hold that much data.

This is imprecise and falls down on MIXED file systems as we can't trust the
data used amount. This resulted in failures for xfstests generic/333 because it
creates lots of clones, which explodes out the extent tree. Our global reserve
calculations were woefully inaccurate in this case which meant we got into a
situation where we did not have enough reserved to do our work.

We know we only use the global block rsv for the extent, csum, and root trees,
so just get the bytes used for these trees and use that as the basis of our
global reserve. Since these are not reference counted trees the bytes_used
value will be accurate. This fixed the transaction aborts seen with
generic/333. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

ae2e4728

Btrfs: use root when checking need_async_flush · 87241c2e

Josef Bacik authored Apr 25, 2016

Instead of doing fs_info->fs_root in need_async_flush, which may not be set
during recovery when mounting, just pass the root itself in, which makes more
sense as thats what btrfs_calc_reclaim_metadata_size takes.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reported-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>

87241c2e

Btrfs: don't bother kicking async if there's nothing to reclaim · d38b349c

Josef Bacik authored Mar 25, 2016

We do this check when we start the async reclaimer thread, might as well check
before we kick it off to save us some cycles.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d38b349c

Btrfs: fix release reserved extents trace points · 31bada7c

Josef Bacik authored Mar 25, 2016

We were doing trace_btrfs_release_reserved_extent() in pin_down_extent which
isn't quite right because we will go through and free that extent later when we
unpin, so it messes up apps that are accounting for the reservation space. We
were also unconditionally doing it in __btrfs_free_reserved_extent(), when we
only actually free the reservation instead of pinning the extent. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

31bada7c

Btrfs: add fsid to some tracepoints · dce3afa5

Josef Bacik authored Mar 25, 2016

When tracing enospc problems on a box with multiple file systems mounted I need
to be able to differentiate between the two file systems. Most of the important
trace points I'm looking at already have an fsid, but the reserved extent trace
points do not, so add that to make it possible to figure out which trace point
belongs to which file system. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

dce3afa5

Btrfs: add tracepoints for flush events · f376df2b

Josef Bacik authored Mar 25, 2016

We want to track when we're triggering flushing from our reservation code and
what flushing is being done when we start flushing.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

f376df2b

Btrfs: fix delalloc reservation amount tracepoint · f485c9ee

Josef Bacik authored Mar 25, 2016

We can sometimes drop the reservation we had for our inode, so we need to remove
that amount from to_reserve so that our tracepoint reports a valid amount of
space.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

f485c9ee

Btrfs: trace pinned extents · c51e7bb1

Josef Bacik authored Mar 25, 2016

Pinned extents are an important metric to keep track of for enospc.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

c51e7bb1

Btrfs: introduce ticketed enospc infrastructure · 957780eb

Josef Bacik authored May 17, 2016

Our enospc flushing sucks. It is born from a time where we were early
enospc'ing constantly because multiple threads would race in for the same
reservation and randomly starve other ones out. So I came up with this solution
to block any other reservations from happening while one guy tried to flush
stuff to satisfy his reservation. This gives us pretty good correctness, but
completely crap latency.

The solution I've come up with is ticketed reservations. Basically we try to
make our reservation, and if we can't we put a ticket on a list in order and
kick off an async flusher thread. This async flusher thread does the same old
flushing we always did, just asynchronously. As space is freed and added back
to the space_info it checks and sees if we have any tickets that need
satisfying, and adds space to the tickets and wakes up anything we've satisfied.

Once the flusher thread stops making progress it wakes up all the current
tickets and tells them to take a hike.

There is a priority list for things that can't flush, since the async flusher
could do anything we need to avoid deadlocks. These guys get priority for
having their reservation made, and will still do manual flushing themselves in
case the async flusher isn't running.

This patch gives us significantly better latencies. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

957780eb

Btrfs: add tracepoint for adding block groups · c83f8eff

Josef Bacik authored Mar 25, 2016

I'm writing a tool to visualize the enospc system inside btrfs, I need this
tracepoint in order to keep track of the block groups in the system.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

c83f8eff

Btrfs: warn_on for unaccounted spaces · d555b6c3

Josef Bacik authored Mar 25, 2016

These were hidden behind enospc_debug, which isn't helpful as they indicate
actual bugs, unlike the rest of the enospc_debug stuff which is really debug
information.  Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

d555b6c3

Btrfs: change delayed reservation fallback behavior · c48f49d6

Josef Bacik authored Mar 25, 2016

We reserve space for the inode update when we first reserve space for writing to
a file. However there are lots of ways that we can use this reservation and not
have it for subsequent ordered extents. Previously we'd fall through and try to
reserve metadata bytes for this, then we'd just steal the full reservation from
the delalloc_block_rsv, and if that didn't have enough space we'd steal the full
reservation from the global reserve. The problem with this is we can easily
just return ENOSPC and fallback to updating the inode item directly. In the
worst case (assuming 4k nodesize) we'd steal 64kib from the global reserve if we
fall all the way through, however if we just fallback and update the inode
directly we'd only steal 4k * BTRFS_PATH_MAX in the worst case which is 32kib.

We would have also just added the extent item for the inode so we likely will
have already cow'ed down most of the way to the leaf containing the inode item,
so we are more often than not only need one or two nodesize's worth of
reservations. Given the reservation for the extent itself is also a worst case
we will likely already have space to cover the inode update.

This change will make us behave better in the theoretical worst case, and much
better in the case that we don't have our reservation and cannot reserve more
metadata. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

c48f49d6

Btrfs: always reserve metadata for delalloc extents · 48c3d480

Josef Bacik authored Mar 25, 2016

There are a few races in the metadata reservation stuff. First we add the bytes
to the block_rsv well after we've set the bit on the inode saying that we have
space for it and after we've reserved the bytes. So use the normal
btrfs_block_rsv_add helper for this case. Secondly we can flush delalloc
extents when we try to reserve space for our write, which means that we could
have used up the space for the inode and we wouldn't know because we only check
before the reservation. So instead make sure we are always reserving space for
the inode update, and then if we don't need it release those bytes afterward.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>

48c3d480

Btrfs: fix callers of btrfs_block_rsv_migrate · 25d609f8

Josef Bacik authored Mar 25, 2016

So btrfs_block_rsv_migrate just unconditionally calls block_rsv_migrate_bytes.
Not only this but it unconditionally changes the size of the block_rsv. This
isn't a bug strictly speaking, but it makes truncate block rsv's look funny
because every time we migrate bytes over its size grows, even though we only
want it to be a specific size. So collapse this into one function that takes an
update_size argument and make truncate and evict not update the size for
consistency sake. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

25d609f8

Btrfs: add bytes_readonly to the spaceinfo at once · e40edf2d

Josef Bacik authored Mar 25, 2016

For some reason we're adding bytes_readonly to the space info after we update
the space info with the block group info. This creates a tiny race where we
could over-reserve space because we haven't yet taken out the bytes_readonly
bit. Since we already know this information at the time we call
update_space_info, just pass it along so it can be updated all at once. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>

e40edf2d

04 Jul, 2016 1 commit
- Linux 4.7-rc6 · a99cde43
  Linus Torvalds authored Jul 03, 2016
  
  a99cde43
03 Jul, 2016 5 commits

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 0b295dd5

Linus Torvalds authored Jul 03, 2016

Pull fuse fix from Miklos Szeredi:
 "This makes sure userspace filesystems are not broken by the parallel
  lookups and readdir feature"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: serialize dirops by default

0b295dd5

Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · 236bfd8e

Linus Torvalds authored Jul 03, 2016

Pull overlayfs fixes from Miklos Szeredi:
 "This contains fixes for a dentry leak, a regression in 4.6 noticed by
  Docker users and missing write access checking in truncate"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: warn instead of error if d_type is not supported
  ovl: get_write_access() in truncate
  ovl: fix dentry leak for default_permissions

236bfd8e

ovl: warn instead of error if d_type is not supported · e7c0b599

Vivek Goyal authored Jul 01, 2016

overlay needs underlying fs to support d_type. Recently I put in a
patch in to detect this condition and started failing mount if
underlying fs did not support d_type.

But this breaks existing configurations over kernel upgrade. Those who
are running docker (partially broken configuration) with xfs not
supporting d_type, are surprised that after kernel upgrade docker does
not run anymore.

https://github.com/docker/docker/issues/22937#issuecomment-229881315

So instead of erroring out, detect broken configuration and warn
about it. This should allow existing docker setups to continue
working after kernel upgrade.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 45aebeaf ("ovl: Ensure upper filesystem supports d_type")
Cc: <stable@vger.kernel.org> 4.6

e7c0b599

Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 4f302921

Linus Torvalds authored Jul 02, 2016

Pull MIPS fix from Ralf Baechle:
 "Only a single fix for 4.7 pending at this point.  It fixes an issue
  that may lead to corruption of the cache mode bits in the page table"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Fix possible corruption of cache mode by mprotect.

4f302921

Merge tag 'powerpc-4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 70bd68d7

Linus Torvalds authored Jul 02, 2016

Pull powerpc fixes from Michael Ellerman:

 - tm: Always reclaim in start_thread() for exec() class syscalls from
   Cyril Bur

 - tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 from Michael
   Neuling

 - eeh: Fix wrong argument passed to eeh_rmv_device() from Gavin Shan

 - Initialise pci_io_base as early as possible from Darren Stevens

* tag 'powerpc-4.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Initialise pci_io_base as early as possible
  powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0
  powerpc/eeh: Fix wrong argument passed to eeh_rmv_device()
  powerpc/tm: Always reclaim in start_thread() for exec() class syscalls

70bd68d7

02 Jul, 2016 6 commits

Merge tag 'drm-fixes-for-v4.7-rc6' of git://people.freedesktop.org/~airlied/linux · 99b0f54e

Linus Torvalds authored Jul 02, 2016

Pull drm fixes frlm Dave Airlie:
 "Just some AMD and Intel fixes, the AMD ones are further production
  Polaris fixes, and the Intel ones fix some early timeouts, some PCI ID
  changes and a couple of other fixes.

  Still a bit Internet challenged here, hopefully end of next week will
  solve it"

* tag 'drm-fixes-for-v4.7-rc6' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: Fix missing unlock on error in i915_ppgtt_info()
  drm/amd/powerplay: workaround for UVD clock issue
  drm/amdgpu: add ACLK_CNTL setting for polaris10
  drm/amd/powerplay: fix issue uvd dpm can't enabled on Polaris11.
  drm/amd/powerplay: Workaround for Memory EDC Error on Polaris10.
  drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
  drm/i915: Add more Kabylake PCI IDs.
  drm/i915: Avoid early timeout during AUX transfers
  drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
  drm/i915/lpt: Avoid early timeout during FDI PHY reset
  drm/i915/bxt: Avoid early timeout during PLL enable
  drm/i915: Refresh cached DP port register value on resume
  drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation
  drm/amd/powerplay: disable FFC.
  drm/amd/powerplay: add some definition for FFC feature on polaris.

99b0f54e

Merge tag 'spi-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 467ce769

Linus Torvalds authored Jul 02, 2016

Pull spi fixes from Mark Brown:
 "A few small driver-specific fixes for SPI, all in the normal important
  if you hit them category especially the rockchip driver fix which
  addresses a race which has been exposed more frequently with some
  recent performance improvements"

* tag 'spi-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: sunxi: fix transfer timeout
  spi: sun4i: fix FIFO limit
  spi: rockchip: Signal unfinished DMA transfers
  spi: spi-ti-qspi: Suspend the queue before removing the device

467ce769

Merge tag 'regulator-fix-v4.7-rc5' of... · a2b0db5b

Linus Torvalds authored Jul 02, 2016

Merge tag 'regulator-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "Two small fixes for the regulator subsystem - one fixing a crash with
  one of the devices supported by the max77620 driver, another fixing
  startup for the anatop regulator when it starts up with the regulator
  in bypass mode"

* tag 'regulator-fix-v4.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: max77620: check for valid regulator info
  regulator: anatop: allow regulator to be in bypass mode

a2b0db5b

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 44385120

Linus Torvalds authored Jul 02, 2016

Pull clk fixes from Stephen Boyd:
 "A small fix for the newly added oxnas clk driver and a handful of
  rockchip clk driver fixes for newly added rk3399 support"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: Fix return value check in oxnas_stdclk_probe()
  clk: rockchip: release io resource when failing to init clk on rk3399
  clk: rockchip: fix cpuclk registration error handling
  clk: rockchip: Revert "clk: rockchip: reset init state before mmc card initialization"
  clk: rockchip: fix incorrect parent for rk3399's {c,g}pll_aclk_perihp_src
  clk: rockchip: mark rk3399 GIC clocks as critical
  clk: rockchip: initialize flags of clk_init_data in mmc-phase clock

44385120

Merge tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel into drm-fixes · 88c08710

Dave Airlie authored Jul 02, 2016

here's a batch of i915 fixes for 4.7.

* tag 'drm-intel-fixes-2016-06-30' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Fix missing unlock on error in i915_ppgtt_info()
  drm/i915: Removing PCI IDs that are no longer listed as Kabylake.
  drm/i915: Add more Kabylake PCI IDs.
  drm/i915: Avoid early timeout during AUX transfers
  drm/i915/hsw: Avoid early timeout during LCPLL disable/restore
  drm/i915/lpt: Avoid early timeout during FDI PHY reset
  drm/i915/bxt: Avoid early timeout during PLL enable
  drm/i915: Refresh cached DP port register value on resume

88c08710

Merge branch 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 40793e85

Dave Airlie authored Jul 02, 2016

Just a few more late fixes for Polaris cards.

* 'drm-fixes-4.7' of git://people.freedesktop.org/~agd5f/linux:
  drm/amd/powerplay: workaround for UVD clock issue
  drm/amdgpu: add ACLK_CNTL setting for polaris10
  drm/amd/powerplay: fix issue uvd dpm can't enabled on Polaris11.
  drm/amd/powerplay: Workaround for Memory EDC Error on Polaris10.
  drm/amd/powerplay: Update CKS on/ CKS off voltage offset calculation
  drm/amd/powerplay: disable FFC.
  drm/amd/powerplay: add some definition for FFC feature on polaris.

40793e85

01 Jul, 2016 2 commits

MIPS: Fix possible corruption of cache mode by mprotect. · 6d037de9

Ralf Baechle authored Jul 01, 2016

The following testcase may result in a page table entries with a invalid
CCA field being generated:

static void *bindstack;

static int sysrqfd;

static void protect_low(int protect)
{
	mprotect(bindstack, BINDSTACK_SIZE, protect);
}

static void sigbus_handler(int signal, siginfo_t * info, void *context)
{
	void *addr = info->si_addr;

	write(sysrqfd, "x", 1);

	printf("sigbus, fault address %p (should not happen, but might)\n",
	       addr);
	abort();
}

static void run_bind_test(void)
{
	unsigned int *p = bindstack;

	p[0] = 0xf001f001;

	write(sysrqfd, "x", 1);

	/* Set trap on access to p[0] */
	protect_low(PROT_NONE);

	write(sysrqfd, "x", 1);

	/* Clear trap on access to p[0] */
	protect_low(PROT_READ | PROT_WRITE | PROT_EXEC);

	write(sysrqfd, "x", 1);

	/* Check the contents of p[0] */
	if (p[0] != 0xf001f001) {
		write(sysrqfd, "x", 1);

		/* Reached, but shouldn't be */
		printf("badness, shouldn't happen but does\n");
		abort();
	}
}

int main(void)
{
	struct sigaction sa;

	sysrqfd = open("/proc/sysrq-trigger", O_WRONLY);

	if (sigprocmask(SIG_BLOCK, NULL, &sa.sa_mask)) {
		perror("sigprocmask");
		return 0;
	}

	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO | SA_NODEFER | SA_RESTART;
	if (sigaction(SIGBUS, &sa, NULL)) {
		perror("sigaction");
		return 0;
	}

	bindstack = mmap(NULL,
			 BINDSTACK_SIZE,
			 PROT_READ | PROT_WRITE | PROT_EXEC,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (bindstack == MAP_FAILED) {
		perror("mmap bindstack");
		return 0;
	}

	printf("bindstack: %p\n", bindstack);

	run_bind_test();

	printf("done\n");

	return 0;
}

There are multiple ingredients for this:

 1) PAGE_NONE is defined to _CACHE_CACHABLE_NONCOHERENT, which is CCA 3
    on all platforms except SB1 where it's CCA 5.
 2) _page_cachable_default must have bits set which are not set
    _CACHE_CACHABLE_NONCOHERENT.
 3) Either the defective version of pte_modify for XPA or the standard
    version must be in used.  However pte_modify for the 36 bit address
    space support is no affected.

In that case additional bits in the final CCA mode may generate an invalid
value for the CCA field.  On the R10000 system where this was tracked
down for example a CCA 7 has been observed, which is Uncached Accelerated.

Fixed by:

 1) Using the proper CCA mode for PAGE_NONE just like for all the other
    PAGE_* pte/pmd bits.
 2) Fix the two affected variants of pte_modify.

Further code inspection also shows the same issue to exist in pmd_modify
which would affect huge page systems.

Issue in pte_modify tracked down by Alastair Bridgewater, PAGE_NONE
and pmd_modify issue found by me.

The history of this goes back beyond Linus' git history.  Chris Dearman's
commit 35133692 ("[MIPS] Allow setting of
the cache attribute at run time.") missed the opportunity to fix this
but it was originally introduced in lmo commit
d523832cf12007b3242e50bb77d0c9e63e0b6518 ("Missing from last commit.")
and 32cc38229ac7538f2346918a09e75413e8861f87 ("New configuration option
CONFIG_MIPS_UNCACHED.")
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>

6d037de9

Merge tag 'acpi-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · dbdc3bb7

Linus Torvalds authored Jul 01, 2016

Pull ACPI fix from Rafael Wysocki:
 "Fix an expression in the ACPI PCI IRQ management code added by a
  recent commit that overlooked missing parens in it, so the result of
  the computation is incorrect in some cases (Sinan Kaya)"

* tag 'acpi-4.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI,PCI,IRQ: correct operator precedence

dbdc3bb7