1. 06 Jan, 2016 12 commits
    • David Howells's avatar
      KEYS: Don't permit request_key() to construct a new keyring · ef301604
      David Howells authored
      commit 911b79cd upstream.
      
      If request_key() is used to find a keyring, only do the search part - don't
      do the construction part if the keyring was not found by the search.  We
      don't really want keyrings in the negative instantiated state since the
      rejected/negative instantiation error value in the payload is unioned with
      keyring metadata.
      
      Now the kernel gives an error:
      
      	request_key("keyring", "#selinux,bdekeyring", "keyring", KEY_SPEC_USER_SESSION_KEYRING) = -1 EPERM (Operation not permitted)
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ef301604
    • David Howells's avatar
      KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring · 4ad79b3a
      David Howells authored
      commit f05819df upstream.
      
      The following sequence of commands:
      
          i=`keyctl add user a a @s`
          keyctl request2 keyring foo bar @t
          keyctl unlink $i @s
      
      tries to invoke an upcall to instantiate a keyring if one doesn't already
      exist by that name within the user's keyring set.  However, if the upcall
      fails, the code sets keyring->type_data.reject_error to -ENOKEY or some
      other error code.  When the key is garbage collected, the key destroy
      function is called unconditionally and keyring_destroy() uses list_empty()
      on keyring->type_data.link - which is in a union with reject_error.
      Subsequently, the kernel tries to unlink the keyring from the keyring names
      list - which oopses like this:
      
      	BUG: unable to handle kernel paging request at 00000000ffffff8a
      	IP: [<ffffffff8126e051>] keyring_destroy+0x3d/0x88
      	...
      	Workqueue: events key_garbage_collector
      	...
      	RIP: 0010:[<ffffffff8126e051>] keyring_destroy+0x3d/0x88
      	RSP: 0018:ffff88003e2f3d30  EFLAGS: 00010203
      	RAX: 00000000ffffff82 RBX: ffff88003bf1a900 RCX: 0000000000000000
      	RDX: 0000000000000000 RSI: 000000003bfc6901 RDI: ffffffff81a73a40
      	RBP: ffff88003e2f3d38 R08: 0000000000000152 R09: 0000000000000000
      	R10: ffff88003e2f3c18 R11: 000000000000865b R12: ffff88003bf1a900
      	R13: 0000000000000000 R14: ffff88003bf1a908 R15: ffff88003e2f4000
      	...
      	CR2: 00000000ffffff8a CR3: 000000003e3ec000 CR4: 00000000000006f0
      	...
      	Call Trace:
      	 [<ffffffff8126c756>] key_gc_unused_keys.constprop.1+0x5d/0x10f
      	 [<ffffffff8126ca71>] key_garbage_collector+0x1fa/0x351
      	 [<ffffffff8105ec9b>] process_one_work+0x28e/0x547
      	 [<ffffffff8105fd17>] worker_thread+0x26e/0x361
      	 [<ffffffff8105faa9>] ? rescuer_thread+0x2a8/0x2a8
      	 [<ffffffff810648ad>] kthread+0xf3/0xfb
      	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
      	 [<ffffffff815f2ccf>] ret_from_fork+0x3f/0x70
      	 [<ffffffff810647ba>] ? kthread_create_on_node+0x1c2/0x1c2
      
      Note the value in RAX.  This is a 32-bit representation of -ENOKEY.
      
      The solution is to only call ->destroy() if the key was successfully
      instantiated.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4ad79b3a
    • David Howells's avatar
      KEYS: Fix race between key destruction and finding a keyring by name · 8978f9d0
      David Howells authored
      commit 94c4554b upstream.
      
      There appears to be a race between:
      
       (1) key_gc_unused_keys() which frees key->security and then calls
           keyring_destroy() to unlink the name from the name list
      
       (2) find_keyring_by_name() which calls key_permission(), thus accessing
           key->security, on a key before checking to see whether the key usage is 0
           (ie. the key is dead and might be cleaned up).
      
      Fix this by calling ->destroy() before cleaning up the core key data -
      including key->security.
      Reported-by: default avatarPetr Matousek <pmatouse@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8978f9d0
    • Christoph Hellwig's avatar
      scsi_dh: fix randconfig build error · f56370d4
      Christoph Hellwig authored
      commit 294ab783 upstream.
      
      It looks like the Kconfig check that was meant to fix this (commit
      fe9233fb [SCSI] scsi_dh: fix kconfig related
      build errors) was actually reversed, but no-one noticed until the new set of
      patches which separated DM and SCSI_DH).
      
      Fixes: fe9233fbSigned-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Tested-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      f56370d4
    • Daniel Borkmann's avatar
      netlink, mmap: fix edge-case leakages in nf queue zero-copy · 6a838edc
      Daniel Borkmann authored
      commit 6bb0fef4 upstream.
      
      When netlink mmap on receive side is the consumer of nf queue data,
      it can happen that in some edge cases, we write skb shared info into
      the user space mmap buffer:
      
      Assume a possible rx ring frame size of only 4096, and the network skb,
      which is being zero-copied into the netlink skb, contains page frags
      with an overall skb->len larger than the linear part of the netlink
      skb.
      
      skb_zerocopy(), which is generic and thus not aware of the fact that
      shared info cannot be accessed for such skbs then tries to write and
      fill frags, thus leaking kernel data/pointers and in some corner cases
      possibly writing out of bounds of the mmap area (when filling the
      last slot in the ring buffer this way).
      
      I.e. the ring buffer slot is then of status NL_MMAP_STATUS_VALID, has
      an advertised length larger than 4096, where the linear part is visible
      at the slot beginning, and the leaked sizeof(struct skb_shared_info)
      has been written to the beginning of the next slot (also corrupting
      the struct nl_mmap_hdr slot header incl. status etc), since skb->end
      points to skb->data + ring->frame_size - NL_MMAP_HDRLEN.
      
      The fix adds and lets __netlink_alloc_skb() take the actual needed
      linear room for the network skb + meta data into account. It's completely
      irrelevant for non-mmaped netlink sockets, but in case mmap sockets
      are used, it can be decided whether the available skb_tailroom() is
      really large enough for the buffer, or whether it needs to internally
      fallback to a normal alloc_skb().
      
      >From nf queue side, the information whether the destination port is
      an mmap RX ring is not really available without extra port-to-socket
      lookup, thus it can only be determined in lower layers i.e. when
      __netlink_alloc_skb() is called that checks internally for this. I
      chose to add the extra ldiff parameter as mmap will then still work:
      We have data_len and hlen in nfqnl_build_packet_message(), data_len
      is the full length (capped at queue->copy_range) for skb_zerocopy()
      and hlen some possible part of data_len that needs to be copied; the
      rem_len variable indicates the needed remaining linear mmap space.
      
      The only other workaround in nf queue internally would be after
      allocation time by f.e. cap'ing the data_len to the skb_tailroom()
      iff we deal with an mmap skb, but that would 1) expose the fact that
      we use a mmap skb to upper layers, and 2) trim the skb where we
      otherwise could just have moved the full skb into the normal receive
      queue.
      
      After the patch, in my test case the ring slot doesn't fit and therefore
      shows NL_MMAP_STATUS_COPY, where a full skb carries all the data and
      thus needs to be picked up via recv().
      
      Fixes: 3ab1f683 ("nfnetlink: add support for memory mapped netlink")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      6a838edc
    • Daniel Borkmann's avatar
      ebpf: fix fd refcount leaks related to maps in bpf syscall · a7090f77
      Daniel Borkmann authored
      commit 592867bf upstream.
      
      We may already have gotten a proper fd struct through fdget(), so
      whenever we return at the end of an map operation, we need to call
      fdput(). However, each map operation from syscall side first probes
      CHECK_ATTR() to verify that unused fields in the bpf_attr union are
      zero.
      
      In case of malformed input, we return with error, but the lookup to
      the map_fd was already performed at that time, so that we return
      without an corresponding fdput(). Fix it by performing an fdget()
      only right before bpf_map_get(). The fdget() invocation on maps in
      the verifier is not affected.
      
      Fixes: db20fd2b ("bpf: add lookup/update/delete/iterate methods to BPF maps")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      a7090f77
    • Eric Dumazet's avatar
      task_work: remove fifo ordering guarantee · 3fbf4c73
      Eric Dumazet authored
      commit c8219906 upstream.
      
      In commit f341861f ("task_work: add a scheduling point in
      task_work_run()") I fixed a latency problem adding a cond_resched()
      call.
      
      Later, commit ac3d0da8 added yet another loop to reverse a list,
      bringing back the latency spike :
      
      I've seen in some cases this loop taking 275 ms, if for example a
      process with 2,000,000 files is killed.
      
      We could add yet another cond_resched() in the reverse loop, or we
      can simply remove the reversal, as I do not think anything
      would depend on order of task_work_add() submitted works.
      
      Fixes: ac3d0da8 ("task_work: Make task_work_add() lockless")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarMaciej Żenczykowski <maze@google.com>
      Acked-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      3fbf4c73
    • Axel Lin's avatar
      ASoC: spear_pcm: Use devm_snd_dmaengine_pcm_register to fix resource leak · 2a11e1d5
      Axel Lin authored
      commit 2c3f4b97 upstream.
      
      All the callers assume devm_spear_pcm_platform_register is a devm_ API, so
      use devm_snd_dmaengine_pcm_register in devm_spear_pcm_platform_register.
      
      Fixes: e1771bcf ("ASoC: SPEAr: remove custom DMA alloc compat function")
      Signed-off-by: default avatarAxel Lin <axel.lin@ingics.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      2a11e1d5
    • Scott Feldman's avatar
      bridge: fix netlink max attr size · a515cb6e
      Scott Feldman authored
      commit eb4cb851 upstream.
      
      .maxtype should match .policy.  Probably just been getting lucky here
      because IFLA_BRPORT_MAX > IFLA_BR_MAX.
      
      Fixes: 13323516 ("bridge: implement rtnl_link_ops->changelink")
      Signed-off-by: default avatarScott Feldman <sfeldma@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      a515cb6e
    • Florian Fainelli's avatar
      net: bcmgenet: Delay PHY initialization to bcmgenet_open() · 47f3d72e
      Florian Fainelli authored
      commit 6cc8e6d4 upstream.
      
      We are currently doing a full PHY initialization and even starting the
      pHY state machine during bcmgenet_mii_init() which is executed in the
      driver's probe function. This is convenient to determine whether we can
      attach to a proper PHY device but comes at the expense of spending up to
      10ms per MDIO transactions (to reach the waitqueue timeout), which slows
      things down.
      
      This also creates a sitaution where we end-up attaching twice to the
      PHY, which is not quite correct either.
      
      Fix this by moving bcmgenet_mii_probe() into bcmgenet_open() and update
      its error path accordingly.
      
      Avoid printing the message "attached PHY at address 1 [...]" every time
      we bring up/down the interface and remove this print since it duplicates
      what the PHY driver already does for us.
      
      Fixes: 1c1008c7 ("net: bcmgenet: add main driver file")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      47f3d72e
    • Florian Fainelli's avatar
      net: bcmgenet: Use correct dev_id for free_irq · 6629cd48
      Florian Fainelli authored
      commit 978ffac4 upstream.
      
      bcmgenet_open()'s error path call free_irq() with a dev_id argument
      different from the one we used to call request_irq() with, this will
      make us trip over the warning in kernel/irq/manage.c:__free_irq()
      
      Fixes: 1c1008c7 ("net: bcmgenet: add main driver file")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      6629cd48
    • Grygorii Strashko's avatar
      pinctrl: single: dra7: remove PCS_QUIRK_SHARED_IRQ · d2e21c80
      Grygorii Strashko authored
      commit 6417049f upstream.
      
      On DRA7 there is one pinctrl domain (dra7_pmx_core) and
      PRCM wake-up IRQ is not shared, so remove quirk.
      
      Cc: Nishanth Menon <nm@ti.com>
      Fixes: 31320bea ('pinctrl: single: Add DRA7 pinctrl compatibility')
      Signed-off-by: default avatarGrygorii Strashko <grygorii.strashko@ti.com>
      Acked-by: default avatarTero Kristo <t-kristo@ti.com>
      Acked-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      d2e21c80
  2. 04 Jan, 2016 8 commits
    • Christophe Lombard's avatar
      cxl: Fix number of allocated pages in SPA · 32190d02
      Christophe Lombard authored
      commit 4108efb0 upstream.
      
      The scheduled process area is currently allocated before assigning the
      correct maximum processes to the AFU, which will mean we only ever
      allocate a fixed number of pages for the scheduled process area. This
      will limit us to 958 processes with 2 x 64K pages. If we try to use more
      processes than that we'd probably overrun the buffer and corrupt memory
      or crash.
      
      AFUs that require three or more interrupts per process will not be
      affected as they are already limited to less processes than that, but we
      could hit it on an AFU that requires 0, 1 or 2 interrupts per process,
      or when using 4K pages.
      
      This patch moves the initialisation of the num_procs to before the SPA
      allocation so that enough pages will be allocated for the number of
      processes that the AFU supports.
      Signed-off-by: default avatarChristophe Lombard <clombard@linux.vnet.ibm.com>
      Signed-off-by: default avatarIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      [ kamal: backport to 4.2-stable: context ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      32190d02
    • Maxim Sheviakov's avatar
      drm/radeon: fix quirk for MSI R7 370 Armor 2X · cd629882
      Maxim Sheviakov authored
      commit 515c752d upstream.
      
      There was a typo in the original.
      
      bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=92865Signed-off-by: default avatarMaxim Sheviakov <mrader3940@yandex.ru>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      cd629882
    • Alex Deucher's avatar
    • Maxim Sheviakov's avatar
      drm/radeon: add quirk for MSI R7 370 · 8217dd18
      Maxim Sheviakov authored
      commit e7865479 upstream.
      
      Just adds the quirk for MSI R7 370 Armor 2X
      Bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=91294Signed-off-by: default avatarMaxim Sheviakov <mrader3940@yandex.ru>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8217dd18
    • Malcolm Crossley's avatar
      x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map · fe82c1fe
      Malcolm Crossley authored
      commit 64c98e7f upstream.
      
      Sanitizing the e820 map may produce extra E820 entries which would result in
      the topmost E820 entries being removed. The removed entries would typically
      include the top E820 usable RAM region and thus result in the domain having
      signicantly less RAM available to it.
      
      Fix by allowing sanitize_e820_map to use the full size of the allocated E820
      array.
      Signed-off-by: default avatarMalcolm Crossley <malcolm.crossley@citrix.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      [ kamal: backport to 3.19-stable: context ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      fe82c1fe
    • Nikhil Badola's avatar
      drivers: usb: fsl: Workaround for USB erratum-A005275 · 6039455f
      Nikhil Badola authored
      commit f8786a91 upstream.
      
      Incoming packets in high speed are randomly corrupted by h/w
      resulting in multiple errors. This workaround makes FS as
      default mode in all affected socs by disabling HS chirp
      signalling.This errata does not affect FS and LS mode.
      
      Forces all HS devices to connect in FS mode for all socs
      affected by this erratum:
      P3041 and P2041 rev 1.0 and 1.1
      P5020 and P5010 rev 1.0 and 2.0
      P5040, P1010 and T4240 rev 1.0
      Signed-off-by: default avatarRamneek Mehresh <ramneek.mehresh@freescale.com>
      Signed-off-by: default avatarNikhil Badola <nikhil.badola@freescale.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      6039455f
    • Nikhil Badola's avatar
      drivers: usb :fsl: Implement Workaround for USB Erratum A007792 · ae14b033
      Nikhil Badola authored
      commit 523f1dec upstream.
      
      USB controller version-2.5 requires to enable internal UTMI
      phy and program PTS field in PORTSC register before asserting
      controller reset. This is must for successful resetting of the
      controller and subsequent enumeration of usb devices
      Signed-off-by: default avatarNikhil Badola <nikhil.badola@freescale.com>
      Signed-off-by: default avatarSuresh Gupta <suresh.gupta@freescale.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ae14b033
    • Eric Benard's avatar
      mxc_nand: fix copy_spare · 45ba1919
      Eric Benard authored
      commit e5a5d92d upstream.
      
      it was broken by 35d5d20e
      "mtd: mxc_nand: cleanup copy_spare function"
      
      else we get the following error :
      [   22.709507] ubi0: attaching mtd3
      [   23.613470] ubi0: scanning is finished
      [   23.617278] ubi0: empty MTD device detected
      [   23.623219] Unhandled fault: imprecise external abort (0x1c06) at 0x9e62f0ec
      [   23.630291] pgd = 9df80000
      [   23.633005] [9e62f0ec] *pgd=8e60041e(bad)
      [   23.637064] Internal error: : 1c06 [#1] SMP ARM
      [   23.641605] Modules linked in:
      [   23.644687] CPU: 0 PID: 99 Comm: ubiattach Not tainted 4.2.0-dirty #22
      [   23.651222] Hardware name: Freescale i.MX53 (Device Tree Support)
      [   23.657322] task: 9e687300 ti: 9dcfc000 task.ti: 9dcfc000
      [   23.662744] PC is at memcpy16_toio+0x4c/0x74
      [   23.667026] LR is at mxc_nand_command+0x484/0x640
      [   23.671739] pc : [<803f9c08>]    lr : [<803faeb0>]    psr: 60000013
      [   23.671739] sp : 9dcfdb10  ip : 9e62f0ea  fp : 9dcfdb1c
      [   23.683222] r10: a09c1000  r9 : 0000001a  r8 : ffffffff
      [   23.688453] r7 : ffffffff  r6 : 9e674810  r5 : 9e674810  r4 : 000000b6
      [   23.694985] r3 : a09c16a4  r2 : a09c16a4  r1 : a09c16a4  r0 : 0000ffff
      [   23.701521] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
      [   23.708662] Control: 10c5387d  Table: 8df80019  DAC: 00000015
      [   23.714413] Process ubiattach (pid: 99, stack limit = 0x9dcfc210)
      [   23.720514] Stack: (0x9dcfdb10 to 0x9dcfe000)
      [   23.724881] db00:                                     9dcfdb6c 9dcfdb20 803faeb0 803f9bc8
      [   23.733069] db20: 803f227c 803f9b74 ffffffff 9e674810 9e674810 9e674810 00000040 9e62f010
      [   23.741255] db40: 803faa2c 9e674b40 9e674810 803faa2c 00000400 803faa2c 00000000 9df42800
      [   23.749441] db60: 9dcfdb9c 9dcfdb70 803f2024 803faa38 9e4201cc 00000000 803f0a78 9e674b40
      [   23.757627] db80: 803f1f80 9e674810 00000400 00000400 9dcfdc14 9dcfdba0 803f3bd8 803f1f8c
      [   23.765814] dba0: 9e4201cc 00000000 00000580 00000000 00000000 800718c0 0000007f 00001000
      [   23.774000] dbc0: 9df42800 000000e0 00000000 00000000 9e4201cc 00000000 00000000 00000000
      [   23.782186] dbe0: 00000580 00000580 00000000 9e674810 9dcfdc20 9dcfdce8 9df42800 00580000
      [   23.790372] dc00: 00000000 00000400 9dcfdc6c 9dcfdc18 803f3f94 803f39a4 9dcfdc20 00000000
      [   23.798558] dc20: 00000000 00000400 00000000 00000000 00000000 00000000 9df42800 00000000
      [   23.806744] dc40: 9dcfdd0c 00580000 00000000 00000400 00000000 9df42800 9dee1000 9d802000
      [   23.814930] dc60: 9dcfdc94 9dcfdc70 803eb63c 803f3f38 00000400 9dcfdce8 9df42800 dead4ead
      [   23.823116] dc80: 803eb5f4 00000000 9dcfdcc4 9dcfdc98 803e82ac 803eb600 00000400 9dcfdce8
      [   23.831301] dca0: 9df42800 00000400 9dee0000 00000000 00000400 00000000 9dcfdd1c 9dcfdcc8
      [   23.839488] dcc0: 80406048 803e8230 00000400 9dcfdce8 9df42800 9dcfdc78 00000008 00000000
      [   23.847673] dce0: 00000000 00000000 00000000 00000004 00000000 9df42800 9dee0000 00000000
      [   23.855859] dd00: 9d802030 00000000 9dc8b214 9d802000 9dcfdd44 9dcfdd20 804066cc 80405f50
      [   23.864047] dd20: 00000400 9dc8b200 9d802030 9df42800 9dee0000 9dc8b200 9dcfdd84 9dcfdd48
      [   23.872233] dd40: 8040a544 804065ac 9e401c80 000080d0 9dcfdd84 00000001 800fc828 9df42400
      [   23.880418] dd60: 00000000 00000080 9dc8b200 9dc8b200 9dc8b200 9dee0000 9dcfdddc 9dcfdd88
      [   23.888605] dd80: 803fb560 8040a440 9dcfddc4 9dcfdd98 800f1428 9dee1000 a0acf000 00000000
      [   23.896792] dda0: 00000000 ffffffff 00000006 00000000 9dee0000 9dee0000 00005600 00000080
      [   23.904979] ddc0: 9dc8b200 a0acf000 9dc8b200 8112514c 9dcfde24 9dcfdde0 803fc08c 803fb4f0
      [   23.913165] dde0: 9e401c80 00000013 9dcfde04 9dcfddf8 8006bbf8 8006ba00 9dcfde24 00000000
      [   23.921351] de00: 9dee0000 00000065 9dee0000 00000001 9dc8b200 8112514c 9dcfde84 9dcfde28
      [   23.929538] de20: 8040afa0 803fb948 ffffffff 00000000 9dc8b214 9dcfde40 800f1428 800f11dc
      [   23.937724] de40: 9dc8b21c 9dc8b20c 9dc8b204 9dee1000 9dc8b214 8069bb60 fffff000 fffff000
      [   23.945911] de60: 9e7b5400 00000000 9dee0000 9dee1000 00001000 9e7b5400 9dcfdecc 9dcfde88
      [   23.954097] de80: 803ff1bc 8040a630 9dcfdea4 9dcfde98 00000800 00000800 9dcfdecc 9dcfdea8
      [   23.962284] dea0: 803e8f6c 00000000 7e87ab70 9e7b5400 80113e30 00000003 9dcfc000 00000000
      [   23.970470] dec0: 9dcfdf04 9dcfded0 804008cc 803feb98 ffffffff 00000003 00000000 00000000
      [   23.978656] dee0: 00000000 00000000 9e7cb000 9dc193e0 7e87ab70 9dd92140 9dcfdf7c 9dcfdf08
      [   23.986842] df00: 80113b5c 8040080c 800fbed8 8006bbf0 9e7cb000 00000003 9e7cb000 9dd92140
      [   23.995029] df20: 9dc193e0 9dd92148 9dcfdf4c 9dcfdf38 8011022c 800fbe78 8000f9cc 9e687300
      [   24.003216] df40: 9dcfdf6c 9dcfdf50 8011f798 8007ffe8 7e87ab70 9dd92140 00000003 9dd92140
      [   24.011402] df60: 40186f40 7e87ab70 9dcfc000 00000000 9dcfdfa4 9dcfdf80 80113e30 8011373c
      [   24.019588] df80: 7e87ab70 7e87ab70 7e87aea9 00000036 8000fb84 9dcfc000 00000000 9dcfdfa8
      [   24.027775] dfa0: 8000f9a0 80113e00 7e87ab70 7e87ab70 00000003 40186f40 7e87ab70 00000000
      [   24.035962] dfc0: 7e87ab70 7e87ab70 7e87aea9 00000036 00000000 00000000 76fd1f70 00000000
      [   24.044148] dfe0: 76f80f8c 7e87ab28 00009810 76f80fc4 60000010 00000003 00000000 00000000
      [   24.052328] Backtrace:
      [   24.054806] [<803f9bbc>] (memcpy16_toio) from [<803faeb0>] (mxc_nand_command+0x484/0x640)
      [   24.062996] [<803faa2c>] (mxc_nand_command) from [<803f2024>] (nand_write_page+0xa4/0x154)
      [   24.071264]  r10:9df42800 r9:00000000 r8:803faa2c r7:00000400 r6:803faa2c r5:9e674810
      [   24.079180]  r4:9e674b40
      [   24.081738] [<803f1f80>] (nand_write_page) from [<803f3bd8>] (nand_do_write_ops+0x240/0x444)
      [   24.090180]  r8:00000400 r7:00000400 r6:9e674810 r5:803f1f80 r4:9e674b40
      [   24.096970] [<803f3998>] (nand_do_write_ops) from [<803f3f94>] (nand_write+0x68/0x88)
      [   24.104804]  r10:00000400 r9:00000000 r8:00580000 r7:9df42800 r6:9dcfdce8 r5:9dcfdc20
      [   24.112719]  r4:9e674810
      [   24.115287] [<803f3f2c>] (nand_write) from [<803eb63c>] (part_write+0x48/0x50)
      [   24.122514]  r10:9d802000 r9:9dee1000 r8:9df42800 r7:00000000 r6:00000400 r5:00000000
      [   24.130429]  r4:00580000
      [   24.132989] [<803eb5f4>] (part_write) from [<803e82ac>] (mtd_write+0x88/0xa0)
      [   24.140129]  r5:00000000 r4:803eb5f4
      [   24.143748] [<803e8224>] (mtd_write) from [<80406048>] (ubi_io_write+0x104/0x65c)
      [   24.151235]  r7:00000000 r6:00000400 r5:00000000 r4:9dee0000
      [   24.156968] [<80405f44>] (ubi_io_write) from [<804066cc>] (ubi_io_write_ec_hdr+0x12c/0x190)
      [   24.165323]  r10:9d802000 r9:9dc8b214 r8:00000000 r7:9d802030 r6:00000000 r5:9dee0000
      [   24.173239]  r4:9df42800
      [   24.175798] [<804065a0>] (ubi_io_write_ec_hdr) from [<8040a544>] (ubi_early_get_peb+0x110/0x1f0)
      [   24.184587]  r6:9dc8b200 r5:9dee0000 r4:9df42800
      [   24.189262] [<8040a434>] (ubi_early_get_peb) from [<803fb560>] (create_vtbl+0x7c/0x238)
      [   24.197271]  r10:9dee0000 r9:9dc8b200 r8:9dc8b200 r7:9dc8b200 r6:00000080 r5:00000000
      [   24.205187]  r4:9df42400
      [   24.207746] [<803fb4e4>] (create_vtbl) from [<803fc08c>] (ubi_read_volume_table+0x750/0xa64)
      [   24.216187]  r10:8112514c r9:9dc8b200 r8:a0acf000 r7:9dc8b200 r6:00000080 r5:00005600
      [   24.224103]  r4:9dee0000
      [   24.226662] [<803fb93c>] (ubi_read_volume_table) from [<8040afa0>] (ubi_attach+0x97c/0x152c)
      [   24.235103]  r10:8112514c r9:9dc8b200 r8:00000001 r7:9dee0000 r6:00000065 r5:9dee0000
      [   24.243018]  r4:00000000
      [   24.245579] [<8040a624>] (ubi_attach) from [<803ff1bc>] (ubi_attach_mtd_dev+0x630/0xbac)
      [   24.253673]  r10:9e7b5400 r9:00001000 r8:9dee1000 r7:9dee0000 r6:00000000 r5:9e7b5400
      [   24.261588]  r4:fffff000
      [   24.264148] [<803feb8c>] (ubi_attach_mtd_dev) from [<804008cc>] (ctrl_cdev_ioctl+0xcc/0x1cc)
      [   24.272589]  r10:00000000 r9:9dcfc000 r8:00000003 r7:80113e30 r6:9e7b5400 r5:7e87ab70
      [   24.280505]  r4:00000000
      [   24.283070] [<80400800>] (ctrl_cdev_ioctl) from [<80113b5c>] (do_vfs_ioctl+0x42c/0x6c4)
      [   24.291077]  r6:9dd92140 r5:7e87ab70 r4:9dc193e0
      [   24.295753] [<80113730>] (do_vfs_ioctl) from [<80113e30>] (SyS_ioctl+0x3c/0x64)
      [   24.303066]  r10:00000000 r9:9dcfc000 r8:7e87ab70 r7:40186f40 r6:9dd92140 r5:00000003
      [   24.310981]  r4:9dd92140
      [   24.313549] [<80113df4>] (SyS_ioctl) from [<8000f9a0>] (ret_fast_syscall+0x0/0x54)
      [   24.321123]  r9:9dcfc000 r8:8000fb84 r7:00000036 r6:7e87aea9 r5:7e87ab70 r4:7e87ab70
      [   24.328957] Code: e1c300b0 e1510002 e1a03001 1afffff9 (e89da800)
      [   24.335066] ---[ end trace ab1cb17887f21bbb ]---
      [   24.340249] Unhandled fault: imprecise external abort (0x1c06) at 0x7ee8bcf0
      [   24.347310] pgd = 9df3c000
      [   24.350023] [7ee8bcf0] *pgd=8dcbf831, *pte=8eb3334f, *ppte=8eb3383f
      Segmentation fault
      
      Fixes: 35d5d20e ("mtd: mxc_nand: cleanup copy_spare function")
      Signed-off-by: default avatarEric Bénard <eric@eukrea.com>
      Reviewed-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Reviewed-by: default avatarBaruch Siach <baruch@tkos.co.il>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      45ba1919
  3. 15 Dec, 2015 20 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.2.8 · 5f5c1db0
      Greg Kroah-Hartman authored
      5f5c1db0
    • Filipe Manana's avatar
      Btrfs: fix regression running delayed references when using qgroups · dacd97b8
      Filipe Manana authored
      commit b06c4bf5 upstream.
      
      In the kernel 4.2 merge window we had a big changes to the implementation
      of delayed references and qgroups which made the no_quota field of delayed
      references not used anymore. More specifically the no_quota field is not
      used anymore as of:
      
        commit 0ed4792a ("btrfs: qgroup: Switch to new extent-oriented qgroup mechanism.")
      
      Leaving the no_quota field actually prevents delayed references from
      getting merged, which in turn cause the following BUG_ON(), at
      fs/btrfs/extent-tree.c, to be hit when qgroups are enabled:
      
        static int run_delayed_tree_ref(...)
        {
           (...)
           BUG_ON(node->ref_mod != 1);
           (...)
        }
      
      This happens on a scenario like the following:
      
        1) Ref1 bytenr X, action = BTRFS_ADD_DELAYED_REF, no_quota = 1, added.
      
        2) Ref2 bytenr X, action = BTRFS_DROP_DELAYED_REF, no_quota = 0, added.
           It's not merged with Ref1 because Ref1->no_quota != Ref2->no_quota.
      
        3) Ref3 bytenr X, action = BTRFS_ADD_DELAYED_REF, no_quota = 1, added.
           It's not merged with the reference at the tail of the list of refs
           for bytenr X because the reference at the tail, Ref2 is incompatible
           due to Ref2->no_quota != Ref3->no_quota.
      
        4) Ref4 bytenr X, action = BTRFS_DROP_DELAYED_REF, no_quota = 0, added.
           It's not merged with the reference at the tail of the list of refs
           for bytenr X because the reference at the tail, Ref3 is incompatible
           due to Ref3->no_quota != Ref4->no_quota.
      
        5) We run delayed references, trigger merging of delayed references,
           through __btrfs_run_delayed_refs() -> btrfs_merge_delayed_refs().
      
        6) Ref1 and Ref3 are merged as Ref1->no_quota = Ref3->no_quota and
           all other conditions are satisfied too. So Ref1 gets a ref_mod
           value of 2.
      
        7) Ref2 and Ref4 are merged as Ref2->no_quota = Ref4->no_quota and
           all other conditions are satisfied too. So Ref2 gets a ref_mod
           value of 2.
      
        8) Ref1 and Ref2 aren't merged, because they have different values
           for their no_quota field.
      
        9) Delayed reference Ref1 is picked for running (select_delayed_ref()
           always prefers references with an action == BTRFS_ADD_DELAYED_REF).
           So run_delayed_tree_ref() is called for Ref1 which triggers the
           BUG_ON because Ref1->red_mod != 1 (equals 2).
      
      So fix this by removing the no_quota field, as it's not used anymore as
      of commit 0ed4792a ("btrfs: qgroup: Switch to new extent-oriented
      qgroup mechanism.").
      
      The use of no_quota was also buggy in at least two places:
      
      1) At delayed-refs.c:btrfs_add_delayed_tree_ref() - we were setting
         no_quota to 0 instead of 1 when the following condition was true:
         is_fstree(ref_root) || !fs_info->quota_enabled
      
      2) At extent-tree.c:__btrfs_inc_extent_ref() - we were attempting to
         reset a node's no_quota when the condition "!is_fstree(root_objectid)
         || !root->fs_info->quota_enabled" was true but we did it only in
         an unused local stack variable, that is, we never reset the no_quota
         value in the node itself.
      
      This fixes the remainder of problems several people have been having when
      running delayed references, mostly while a balance is running in parallel,
      on a 4.2+ kernel.
      
      Very special thanks to Stéphane Lesimple for helping debugging this issue
      and testing this fix on his multi terabyte filesystem (which took more
      than one day to balance alone, plus fsck, etc).
      
      Also, this fixes deadlock issue when using the clone ioctl with qgroups
      enabled, as reported by Elias Probst in the mailing list. The deadlock
      happens because after calling btrfs_insert_empty_item we have our path
      holding a write lock on a leaf of the fs/subvol tree and then before
      releasing the path we called check_ref() which did backref walking, when
      qgroups are enabled, and tried to read lock the same leaf. The trace for
      this case is the following:
      
        INFO: task systemd-nspawn:6095 blocked for more than 120 seconds.
        (...)
        Call Trace:
          [<ffffffff86999201>] schedule+0x74/0x83
          [<ffffffff863ef64c>] btrfs_tree_read_lock+0xc0/0xea
          [<ffffffff86137ed7>] ? wait_woken+0x74/0x74
          [<ffffffff8639f0a7>] btrfs_search_old_slot+0x51a/0x810
          [<ffffffff863a129b>] btrfs_next_old_leaf+0xdf/0x3ce
          [<ffffffff86413a00>] ? ulist_add_merge+0x1b/0x127
          [<ffffffff86411688>] __resolve_indirect_refs+0x62a/0x667
          [<ffffffff863ef546>] ? btrfs_clear_lock_blocking_rw+0x78/0xbe
          [<ffffffff864122d3>] find_parent_nodes+0xaf3/0xfc6
          [<ffffffff86412838>] __btrfs_find_all_roots+0x92/0xf0
          [<ffffffff864128f2>] btrfs_find_all_roots+0x45/0x65
          [<ffffffff8639a75b>] ? btrfs_get_tree_mod_seq+0x2b/0x88
          [<ffffffff863e852e>] check_ref+0x64/0xc4
          [<ffffffff863e9e01>] btrfs_clone+0x66e/0xb5d
          [<ffffffff863ea77f>] btrfs_ioctl_clone+0x48f/0x5bb
          [<ffffffff86048a68>] ? native_sched_clock+0x28/0x77
          [<ffffffff863ed9b0>] btrfs_ioctl+0xabc/0x25cb
        (...)
      
      The problem goes away by eleminating check_ref(), which no longer is
      needed as its purpose was to get a value for the no_quota field of
      a delayed reference (this patch removes the no_quota field as mentioned
      earlier).
      Reported-by: default avatarStéphane Lesimple <stephane_btrfs@lesimple.fr>
      Tested-by: default avatarStéphane Lesimple <stephane_btrfs@lesimple.fr>
      Reported-by: default avatarElias Probst <mail@eliasprobst.eu>
      Reported-by: default avatarPeter Becker <floyd.net@gmail.com>
      Reported-by: default avatarMalte Schröder <malte@tnxip.de>
      Reported-by: default avatarDerek Dongray <derek@valedon.co.uk>
      Reported-by: default avatarErkki Seppala <flux-btrfs@inside.org>
      Cc: stable@vger.kernel.org  # 4.2+
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarQu Wenruo <quwenruo@cn.fujitsu.com>
      dacd97b8
    • Hans Verkuil's avatar
      cobalt: fix Kconfig dependency · 25c386d0
      Hans Verkuil authored
      commit fc88dd16 upstream.
      
      The cobalt driver should depend on VIDEO_V4L2_SUBDEV_API.
      
      This fixes this kbuild error:
      
      tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
      head:   99bc7215
      commit: 85756a06 [media] cobalt: add new driver
      config: x86_64-randconfig-s0-09201514 (attached as .config)
      reproduce:
        git checkout 85756a06
        # save the attached .config to linux build tree
        make ARCH=x86_64
      
      All error/warnings (new ones prefixed by >>):
      
         drivers/media/i2c/adv7604.c: In function 'adv76xx_get_format':
      >> drivers/media/i2c/adv7604.c:1853:9: error: implicit declaration of function 'v4l2_subdev_get_try_format' [-Werror=implicit-function-declaration]
            fmt = v4l2_subdev_get_try_format(sd, cfg, format->pad);
                  ^
         drivers/media/i2c/adv7604.c:1853:7: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
            fmt = v4l2_subdev_get_try_format(sd, cfg, format->pad);
                ^
         drivers/media/i2c/adv7604.c: In function 'adv76xx_set_format':
         drivers/media/i2c/adv7604.c:1882:7: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
            fmt = v4l2_subdev_get_try_format(sd, cfg, format->pad);
                ^
         cc1: some warnings being treated as errors
      Signed-off-by: default avatarHans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      25c386d0
    • Lu, Han's avatar
      ALSA: hda/hdmi - apply Skylake fix-ups to Broxton display codec · bc506f02
      Lu, Han authored
      commit e2656412 upstream.
      
      Broxton and Skylake have the same behavior on display audio. So this patch
      applys Skylake fix-ups to Broxton.
      Signed-off-by: default avatarLu, Han <han.lu@intel.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc506f02
    • Arnd Bergmann's avatar
      ceph: fix message length computation · 3dbb4105
      Arnd Bergmann authored
      commit 777d738a upstream.
      
      create_request_message() computes the maximum length of a message,
      but uses the wrong type for the time stamp: sizeof(struct timespec)
      may be 8 or 16 depending on the architecture, while sizeof(struct
      ceph_timespec) is always 8, and that is what gets put into the
      message.
      
      Found while auditing the uses of timespec for y2038 problems.
      
      Fixes: b8e69066 ("ceph: include time stamp in every MDS request")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarYan, Zheng <zyan@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3dbb4105
    • Junxiao Bi's avatar
      ocfs2: fix umask ignored issue · b91750f8
      Junxiao Bi authored
      commit 8f1eb487 upstream.
      
      New created file's mode is not masked with umask, and this makes umask not
      work for ocfs2 volume.
      
      Fixes: 702e5bc6 ("ocfs2: use generic posix ACL infrastructure")
      Signed-off-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
      Cc: Gang He <ghe@suse.com>
      Cc: Mark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b91750f8
    • Jeff Layton's avatar
      nfs: if we have no valid attrs, then don't declare the attribute cache valid · 12cdf727
      Jeff Layton authored
      commit c812012f upstream.
      
      If we pass in an empty nfs_fattr struct to nfs_update_inode, it will
      (correctly) not update any of the attributes, but it then clears the
      NFS_INO_INVALID_ATTR flag, which indicates that the attributes are
      up to date. Don't clear the flag if the fattr struct has no valid
      attrs to apply.
      Reviewed-by: default avatarSteve French <steve.french@primarydata.com>
      Signed-off-by: default avatarJeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12cdf727
    • Benjamin Coddington's avatar
      nfs4: start callback_ident at idr 1 · 1fa68c4c
      Benjamin Coddington authored
      commit c68a027c upstream.
      
      If clp->cl_cb_ident is zero, then nfs_cb_idr_remove_locked() skips removing
      it when the nfs_client is freed.  A decoding or server bug can then find
      and try to put that first nfs_client which would lead to a crash.
      Signed-off-by: default avatarBenjamin Coddington <bcodding@redhat.com>
      Fixes: d6870312 ("nfs4client: convert to idr_alloc()")
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fa68c4c
    • Daniel Borkmann's avatar
      debugfs: fix refcount imbalance in start_creating · a1936414
      Daniel Borkmann authored
      commit 0ee9608c upstream.
      
      In debugfs' start_creating(), we pin the file system to safely access
      its root. When we failed to create a file, we unpin the file system via
      failed_creating() to release the mount count and eventually the reference
      of the vfsmount.
      
      However, when we run into an error during lookup_one_len() when still
      in start_creating(), we only release the parent's mutex but not so the
      reference on the mount. Looks like it was done in the past, but after
      splitting portions of __create_file() into start_creating() and
      end_creating() via 190afd81 ("debugfs: split the beginning and the
      end of __create_file() off"), this seemed missed. Noticed during code
      review.
      
      Fixes: 190afd81 ("debugfs: split the beginning and the end of __create_file() off")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a1936414
    • Andrew Elble's avatar
      nfsd: eliminate sending duplicate and repeated delegations · 814a24d5
      Andrew Elble authored
      commit 34ed9872 upstream.
      
      We've observed the nfsd server in a state where there are
      multiple delegations on the same nfs4_file for the same client.
      The nfs client does attempt to DELEGRETURN these when they are presented to
      it - but apparently under some (unknown) circumstances the client does not
      manage to return all of them. This leads to the eventual
      attempt to CB_RECALL more than one delegation with the same nfs
      filehandle to the same client. The first recall will succeed, but the
      next recall will fail with NFS4ERR_BADHANDLE. This leads to the server
      having delegations on cl_revoked that the client has no way to FREE
      or DELEGRETURN, with resulting inability to recover. The state manager
      on the server will continually assert SEQ4_STATUS_RECALLABLE_STATE_REVOKED,
      and the state manager on the client will be looping unable to satisfy
      the server.
      
      List discussion also reports a race between OPEN and DELEGRETURN that
      will be avoided by only sending the delegation once to the
      client. This is also logically in accordance with RFC5561 9.1.1 and 10.2.
      
      So, let's:
      
      1.) Not hand out duplicate delegations.
      2.) Only send them to the client once.
      
      RFC 5561:
      
      9.1.1:
      "Delegations and layouts, on the other hand, are not associated with a
      specific owner but are associated with the client as a whole
      (identified by a client ID)."
      
      10.2:
      "...the stateid for a delegation is associated with a client ID and may be
      used on behalf of all the open-owners for the given client.  A
      delegation is made to the client as a whole and not to any specific
      process or thread of control within it."
      Reported-by: default avatarEric Meddaugh <etmsys@rit.edu>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Cc: Olga Kornievskaia <aglo@umich.edu>
      Signed-off-by: default avatarAndrew Elble <aweits@rit.edu>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      814a24d5
    • Jeff Layton's avatar
      nfsd: serialize state seqid morphing operations · 4708c200
      Jeff Layton authored
      commit 35a92fe8 upstream.
      
      Andrew was seeing a race occur when an OPEN and OPEN_DOWNGRADE were
      running in parallel. The server would receive the OPEN_DOWNGRADE first
      and check its seqid, but then an OPEN would race in and bump it. The
      OPEN_DOWNGRADE would then complete and bump the seqid again.  The result
      was that the OPEN_DOWNGRADE would be applied after the OPEN, even though
      it should have been rejected since the seqid changed.
      
      The only recourse we have here I think is to serialize operations that
      bump the seqid in a stateid, particularly when we're given a seqid in
      the call. To address this, we add a new rw_semaphore to the
      nfs4_ol_stateid struct. We do a down_write prior to checking the seqid
      after looking up the stateid to ensure that nothing else is going to
      bump it while we're operating on it.
      
      In the case of OPEN, we do a down_read, as the call doesn't contain a
      seqid. Those can run in parallel -- we just need to serialize them when
      there is a concurrent OPEN_DOWNGRADE or CLOSE.
      
      LOCK and LOCKU however always take the write lock as there is no
      opportunity for parallelizing those.
      Reported-and-Tested-by: default avatarAndrew W Elble <aweits@rit.edu>
      Signed-off-by: default avatarJeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4708c200
    • Stefan Richter's avatar
      firewire: ohci: fix JMicron JMB38x IT context discovery · 080fe15d
      Stefan Richter authored
      commit 100ceb66 upstream.
      
      Reported by Clifford and Craig for JMicron OHCI-1394 + SDHCI combo
      controllers:  Often or even most of the time, the controller is
      initialized with the message "added OHCI v1.10 device as card 0, 4 IR +
      0 IT contexts, quirks 0x10".  With 0 isochronous transmit DMA contexts
      (IT contexts), applications like audio output are impossible.
      
      However, OHCI-1394 demands that at least 4 IT contexts are implemented
      by the link layer controller, and indeed JMicron JMB38x do implement
      four of them.  Only their IsoXmitIntMask register is unreliable at early
      access.
      
      With my own JMB381 single function controller I found:
        - I can reproduce the problem with a lower probability than Craig's.
        - If I put a loop around the section which clears and reads
          IsoXmitIntMask, then either the first or the second attempt will
          return the correct initial mask of 0x0000000f.  I never encountered
          a case of needing more than a second attempt.
        - Consequently, if I put a dummy reg_read(...IsoXmitIntMaskSet)
          before the first write, the subsequent read will return the correct
          result.
        - If I merely ignore a wrong read result and force the known real
          result, later isochronous transmit DMA usage works just fine.
      
      So let's just fix this chip bug up by the latter method.  Tested with
      JMB381 on kernel 3.13 and 4.3.
      
      Since OHCI-1394 generally requires 4 IT contexts at a minium, this
      workaround is simply applied whenever the initial read of IsoXmitIntMask
      returns 0, regardless whether it's a JMicron chip or not.  I never heard
      of this issue together with any other chip though.
      
      I am not 100% sure that this fix works on the OHCI-1394 part of JMB380
      and JMB388 combo controllers exactly the same as on the JMB381 single-
      function controller, but so far I haven't had a chance to let an owner
      of a combo chip run a patched kernel.
      
      Strangely enough, IsoRecvIntMask is always reported correctly, even
      though it is probed right before IsoXmitIntMask.
      
      Reported-by: Clifford Dunn
      Reported-by: default avatarCraig Moore <craig.moore@qenos.com>
      Signed-off-by: default avatarStefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      080fe15d
    • Daeho Jeong's avatar
      ext4, jbd2: ensure entering into panic after recording an error in superblock · 3be55388
      Daeho Jeong authored
      commit 4327ba52 upstream.
      
      If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the
      journaling will be aborted first and the error number will be recorded
      into JBD2 superblock and, finally, the system will enter into the
      panic state in "errors=panic" option.  But, in the rare case, this
      sequence is little twisted like the below figure and it will happen
      that the system enters into panic state, which means the system reset
      in mobile environment, before completion of recording an error in the
      journal superblock. In this case, e2fsck cannot recognize that the
      filesystem failure occurred in the previous run and the corruption
      wouldn't be fixed.
      
      Task A                        Task B
      ext4_handle_error()
      -> jbd2_journal_abort()
        -> __journal_abort_soft()
          -> __jbd2_journal_abort_hard()
          | -> journal->j_flags |= JBD2_ABORT;
          |
          |                         __ext4_abort()
          |                         -> jbd2_journal_abort()
          |                         | -> __journal_abort_soft()
          |                         |   -> if (journal->j_flags & JBD2_ABORT)
          |                         |           return;
          |                         -> panic()
          |
          -> jbd2_journal_update_sb_errno()
      Tested-by: default avatarHobin Woo <hobin.woo@samsung.com>
      Signed-off-by: default avatarDaeho Jeong <daeho.jeong@samsung.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3be55388
    • Lukas Czerner's avatar
      ext4: fix potential use after free in __ext4_journal_stop · e107e557
      Lukas Czerner authored
      commit 6934da92 upstream.
      
      There is a use-after-free possibility in __ext4_journal_stop() in the
      case that we free the handle in the first jbd2_journal_stop() because
      we're referencing handle->h_err afterwards. This was introduced in
      9705acd6 and it is wrong. Fix it by
      storing the handle->h_err value beforehand and avoid referencing
      potentially freed handle.
      
      Fixes: 9705acd6Signed-off-by: default avatarLukas Czerner <lczerner@redhat.com>
      Reviewed-by: default avatarAndreas Dilger <adilger@dilger.ca>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e107e557
    • Theodore Ts'o's avatar
      ext4 crypto: replace some BUG_ON()'s with error checks · 89fa6ced
      Theodore Ts'o authored
      commit 687c3c36 upstream.
      
      Buggy (or hostile) userspace should not be able to cause the kernel to
      crash.
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      89fa6ced
    • Theodore Ts'o's avatar
      ext4 crypto: fix memory leak in ext4_bio_write_page() · 34a6ac6a
      Theodore Ts'o authored
      commit 937d7b84 upstream.
      
      There are times when ext4_bio_write_page() is called even though we
      don't actually need to do any I/O.  This happens when ext4_writepage()
      gets called by the jbd2 commit path when an inode needs to force its
      pages written out in order to provide data=ordered guarantees --- and
      a page is backed by an unwritten (e.g., uninitialized) block on disk,
      or if delayed allocation means the page's backing store hasn't been
      allocated yet.  In that case, we need to skip the call to
      ext4_encrypt_page(), since in addition to wasting CPU, it leads to a
      bounce page and an ext4 crypto context getting leaked.
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      34a6ac6a
    • Ilya Dryomov's avatar
      rbd: don't put snap_context twice in rbd_queue_workfn() · 587de47c
      Ilya Dryomov authored
      commit 70b16db8 upstream.
      
      Commit 4e752f0a ("rbd: access snapshot context and mapping size
      safely") moved ceph_get_snap_context() out of rbd_img_request_create()
      and into rbd_queue_workfn(), adding a ceph_put_snap_context() to the
      error path in rbd_queue_workfn().  However, rbd_img_request_create()
      consumes a ref on snapc, so calling ceph_put_snap_context() after
      a successful rbd_img_request_create() leads to an extra put.  Fix it.
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: default avatarJosh Durgin <jdurgin@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      587de47c
    • David Sterba's avatar
      btrfs: fix signed overflows in btrfs_sync_file · cf7c570b
      David Sterba authored
      commit 9dcbeed4 upstream.
      
      The calculation of range length in btrfs_sync_file leads to signed
      overflow. This was caught by PaX gcc SIZE_OVERFLOW plugin.
      
      https://forums.grsecurity.net/viewtopic.php?f=1&t=4284
      
      The fsync call passes 0 and LLONG_MAX, the range length does not fit to
      loff_t and overflows, but the value is converted to u64 so it silently
      works as expected.
      
      The minimal fix is a typecast to u64, switching functions to take
      (start, end) instead of (start, len) would be more intrusive.
      
      Coccinelle script found that there's one more opencoded calculation of
      the length.
      
      <smpl>
      @@
      loff_t start, end;
      @@
      * end - start
      </smpl>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf7c570b
    • Filipe Manana's avatar
      Btrfs: fix race when listing an inode's xattrs · d3b189c3
      Filipe Manana authored
      commit f1cd1f0b upstream.
      
      When listing a inode's xattrs we have a time window where we race against
      a concurrent operation for adding a new hard link for our inode that makes
      us not return any xattr to user space. In order for this to happen, the
      first xattr of our inode needs to be at slot 0 of a leaf and the previous
      leaf must still have room for an inode ref (or extref) item, and this can
      happen because an inode's listxattrs callback does not lock the inode's
      i_mutex (nor does the VFS does it for us), but adding a hard link to an
      inode makes the VFS lock the inode's i_mutex before calling the inode's
      link callback.
      
      If we have the following leafs:
      
                     Leaf X (has N items)                    Leaf Y
      
       [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]  [ (257 XATTR_ITEM 12345), ... ]
                 slot N - 2         slot N - 1              slot 0
      
      The race illustrated by the following sequence diagram is possible:
      
             CPU 1                                               CPU 2
      
        btrfs_listxattr()
      
          searches for key (257 XATTR_ITEM 0)
      
          gets path with path->nodes[0] == leaf X
          and path->slots[0] == N
      
          because path->slots[0] is >=
          btrfs_header_nritems(leaf X), it calls
          btrfs_next_leaf()
      
          btrfs_next_leaf()
            releases the path
      
                                                         adds key (257 INODE_REF 666)
                                                         to the end of leaf X (slot N),
                                                         and leaf X now has N + 1 items
      
            searches for the key (257 INODE_REF 256),
            with path->keep_locks == 1, because that
            is the last key it saw in leaf X before
            releasing the path
      
            ends up at leaf X again and it verifies
            that the key (257 INODE_REF 256) is no
            longer the last key in leaf X, so it
            returns with path->nodes[0] == leaf X
            and path->slots[0] == N, pointing to
            the new item with key (257 INODE_REF 666)
      
          btrfs_listxattr's loop iteration sees that
          the type of the key pointed by the path is
          different from the type BTRFS_XATTR_ITEM_KEY
          and so it breaks the loop and stops looking
          for more xattr items
            --> the application doesn't get any xattr
                listed for our inode
      
      So fix this by breaking the loop only if the key's type is greater than
      BTRFS_XATTR_ITEM_KEY and skip the current key if its type is smaller.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d3b189c3
    • Filipe Manana's avatar
      Btrfs: fix race leading to BUG_ON when running delalloc for nodatacow · 3e25e3a0
      Filipe Manana authored
      commit 1d512cb7 upstream.
      
      If we are using the NO_HOLES feature, we have a tiny time window when
      running delalloc for a nodatacow inode where we can race with a concurrent
      link or xattr add operation leading to a BUG_ON.
      
      This happens because at run_delalloc_nocow() we end up casting a leaf item
      of type BTRFS_INODE_[REF|EXTREF]_KEY or of type BTRFS_XATTR_ITEM_KEY to a
      file extent item (struct btrfs_file_extent_item) and then analyse its
      extent type field, which won't match any of the expected extent types
      (values BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]) and therefore trigger an
      explicit BUG_ON(1).
      
      The following sequence diagram shows how the race happens when running a
      no-cow dellaloc range [4K, 8K[ for inode 257 and we have the following
      neighbour leafs:
      
                   Leaf X (has N items)                    Leaf Y
      
       [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]  [ (257 EXTENT_DATA 8192), ... ]
                    slot N - 2         slot N - 1              slot 0
      
       (Note the implicit hole for inode 257 regarding the [0, 8K[ range)
      
             CPU 1                                         CPU 2
      
       run_dealloc_nocow()
         btrfs_lookup_file_extent()
           --> searches for a key with value
               (257 EXTENT_DATA 4096) in the
               fs/subvol tree
           --> returns us a path with
               path->nodes[0] == leaf X and
               path->slots[0] == N
      
         because path->slots[0] is >=
         btrfs_header_nritems(leaf X), it
         calls btrfs_next_leaf()
      
         btrfs_next_leaf()
           --> releases the path
      
                                                    hard link added to our inode,
                                                    with key (257 INODE_REF 500)
                                                    added to the end of leaf X,
                                                    so leaf X now has N + 1 keys
      
           --> searches for the key
               (257 INODE_REF 256), because
               it was the last key in leaf X
               before it released the path,
               with path->keep_locks set to 1
      
           --> ends up at leaf X again and
               it verifies that the key
               (257 INODE_REF 256) is no longer
               the last key in the leaf, so it
               returns with path->nodes[0] ==
               leaf X and path->slots[0] == N,
               pointing to the new item with
               key (257 INODE_REF 500)
      
         the loop iteration of run_dealloc_nocow()
         does not break out the loop and continues
         because the key referenced in the path
         at path->nodes[0] and path->slots[0] is
         for inode 257, its type is < BTRFS_EXTENT_DATA_KEY
         and its offset (500) is less then our delalloc
         range's end (8192)
      
         the item pointed by the path, an inode reference item,
         is (incorrectly) interpreted as a file extent item and
         we get an invalid extent type, leading to the BUG_ON(1):
      
         if (extent_type == BTRFS_FILE_EXTENT_REG ||
            extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
             (...)
         } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
             (...)
         } else {
             BUG_ON(1)
         }
      
      The same can happen if a xattr is added concurrently and ends up having
      a key with an offset smaller then the delalloc's range end.
      
      So fix this by skipping keys with a type smaller than
      BTRFS_EXTENT_DATA_KEY.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e25e3a0