1. 27 Aug, 2016 13 commits
    • Yishai Hadas's avatar
      IB/mlx4: Fix the SQ size of an RC QP · 9a7edde8
      Yishai Hadas authored
      commit f2940e2c upstream.
      
      When calculating the required size of an RC QP send queue, leave
      enough space for masked atomic operations, which require more space than
      "regular" atomic operation.
      
      Fixes: 6fa8f719 ("IB/mlx4: Add support for masked atomic operations")
      Signed-off-by: default avatarYishai Hadas <yishaih@mellanox.com>
      Reviewed-by: default avatarJack Morgenstein <jackm@mellanox.co.il>
      Reviewed-by: default avatarEran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      9a7edde8
    • Erez Shitrit's avatar
      IB/IPoIB: Don't update neigh validity for unresolved entries · 0c3c7f4c
      Erez Shitrit authored
      commit 61c78eea upstream.
      
      ipoib_neigh_get unconditionally updates the "alive" variable member on
      any packet send.  This prevents the neighbor garbage collection from
      cleaning out a dead neighbor entry if we are still queueing packets
      for it.  If the queue for this neighbor is full, then don't update the
      alive timestamp.  That way the neighbor can time out even if packets
      are still being queued as long as none of them are being sent.
      
      Fixes: b63b70d8 ("IPoIB: Use a private hash table for path lookup in xmit path")
      Signed-off-by: default avatarErez Shitrit <erezsh@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0c3c7f4c
    • Jason Gunthorpe's avatar
      IB/security: Restrict use of the write() interface · 55ac348c
      Jason Gunthorpe authored
      commit e6bd18f5 upstream.
      
      The drivers/infiniband stack uses write() as a replacement for
      bi-directional ioctl().  This is not safe. There are ways to
      trigger write calls that result in the return structure that
      is normally written to user space being shunted off to user
      specified kernel memory instead.
      
      For the immediate repair, detect and deny suspicious accesses to
      the write API.
      
      For long term, update the user space libraries and the kernel API
      to something that doesn't present the same security vulnerabilities
      (likely a structured ioctl() interface).
      
      The impacted uAPI interfaces are generally only available if
      hardware from drivers/infiniband is installed in the system.
      Reported-by: default avatarJann Horn <jann@thejh.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      [ Expanded check to all known write() entry points ]
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      [wt: no hfi1 subdir in 3.10. A minimal rdma/ib.h had to be created
       from 3.11 sources to keep the code similar to mainline]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      55ac348c
    • Jason Gunthorpe's avatar
      IB/mlx4: Properly initialize GRH TClass and FlowLabel in AHs · f40b2f17
      Jason Gunthorpe authored
      commit 8c5122e4 upstream.
      
      When this code was reworked for IBoE support the order of assignments
      for the sl_tclass_flowlabel got flipped around resulting in
      TClass & FlowLabel being permanently set to 0 in the packet headers.
      
      This breaks IB routers that rely on these headers, but only affects
      kernel users - libmlx4 does this properly for user space.
      
      Cc: stable@vger.kernel.org
      Fixes: fa417f7b ("IB/mlx4: Add support for IBoE")
      Signed-off-by: default avatarJason Gunthorpe <jgunthorpe@obsidianresearch.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      f40b2f17
    • Martin Willi's avatar
      mac80211_hwsim: Add missing check for HWSIM_ATTR_SIGNAL · 62cfca04
      Martin Willi authored
      commit 62397da5 upstream.
      
      A wmediumd that does not send this attribute causes a NULL pointer
      dereference, as the attribute is accessed even if it does not exist.
      
      The attribute was required but never checked ever since userspace frame
      forwarding has been introduced. The issue gets more problematic once we
      allow wmediumd registration from user namespaces.
      
      Fixes: 7882513b ("mac80211_hwsim driver support userspace frame tx/rx")
      Signed-off-by: default avatarMartin Willi <martin@strongswan.org>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      62cfca04
    • Bob Copeland's avatar
      mac80211: mesh: flush mesh paths unconditionally · 11d71499
      Bob Copeland authored
      commit fe7a7c57 upstream.
      
      Currently, the mesh paths associated with a nexthop station are cleaned
      up in the following code path:
      
          __sta_info_destroy_part1
          synchronize_net()
          __sta_info_destroy_part2
           -> cleanup_single_sta
             -> mesh_sta_cleanup
               -> mesh_plink_deactivate
                 -> mesh_path_flush_by_nexthop
      
      However, there are a couple of problems here:
      
      1) the paths aren't flushed at all if the MPM is running in userspace
         (e.g. when using wpa_supplicant or authsae)
      
      2) there is no synchronize_rcu between removing the path and readers
         accessing the nexthop, which means the following race is possible:
      
      CPU0                            CPU1
      ~~~~                            ~~~~
                                      sta_info_destroy_part1()
                                      synchronize_net()
      rcu_read_lock()
      mesh_nexthop_resolve()
        mpath = mesh_path_lookup()
                                      [...] -> mesh_path_flush_by_nexthop()
        sta = rcu_dereference(
          mpath->next_hop)
                                      kfree(sta)
        access sta <-- CRASH
      
      Fix both of these by unconditionally flushing paths before destroying
      the sta, and by adding a synchronize_net() after path flush to ensure
      no active readers can still dereference the sta.
      
      Fixes this crash:
      
      [  348.529295] BUG: unable to handle kernel paging request at 00020040
      [  348.530014] IP: [<f929245d>] ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211]
      [  348.530014] *pde = 00000000
      [  348.530014] Oops: 0000 [#1] PREEMPT
      [  348.530014] Modules linked in: drbg ansi_cprng ctr ccm ppp_generic slhc ipt_MASQUERADE nf_nat_masquerade_ipv4 8021q ]
      [  348.530014] CPU: 0 PID: 20597 Comm: wget Tainted: G           O 4.6.0-rc5-wt=V1 #1
      [  348.530014] Hardware name: To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080016  11/07/2014
      [  348.530014] task: f64fa280 ti: f4f9c000 task.ti: f4f9c000
      [  348.530014] EIP: 0060:[<f929245d>] EFLAGS: 00010246 CPU: 0
      [  348.530014] EIP is at ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211]
      [  348.530014] EAX: f4ce63e0 EBX: 00000088 ECX: f3788416 EDX: 00020008
      [  348.530014] ESI: 00000000 EDI: 00000088 EBP: f6409a4c ESP: f6409a40
      [  348.530014]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
      [  348.530014] CR0: 80050033 CR2: 00020040 CR3: 33190000 CR4: 00000690
      [  348.530014] Stack:
      [  348.530014]  00000000 f4ce63e0 f5f9bd80 f6409a64 f9291d80 0000ce67 f5d51e00 f4ce63e0
      [  348.530014]  f3788416 f6409a80 f9291dc1 f4ce8320 f4ce63e0 f5d51e00 f4ce63e0 f4ce8320
      [  348.530014]  f6409a98 f9277f6f 00000000 00000000 0000007c 00000000 f6409b2c f9278dd1
      [  348.530014] Call Trace:
      [  348.530014]  [<f9291d80>] mesh_nexthop_lookup+0xbb/0xc8 [mac80211]
      [  348.530014]  [<f9291dc1>] mesh_nexthop_resolve+0x34/0xd8 [mac80211]
      [  348.530014]  [<f9277f6f>] ieee80211_xmit+0x92/0xc1 [mac80211]
      [  348.530014]  [<f9278dd1>] __ieee80211_subif_start_xmit+0x807/0x83c [mac80211]
      [  348.530014]  [<c04df012>] ? sch_direct_xmit+0xd7/0x1b3
      [  348.530014]  [<c022a8c6>] ? __local_bh_enable_ip+0x5d/0x7b
      [  348.530014]  [<f956870c>] ? nf_nat_ipv4_out+0x4c/0xd0 [nf_nat_ipv4]
      [  348.530014]  [<f957e036>] ? iptable_nat_ipv4_fn+0xf/0xf [iptable_nat]
      [  348.530014]  [<c04c6f45>] ? netif_skb_features+0x14d/0x30a
      [  348.530014]  [<f9278e10>] ieee80211_subif_start_xmit+0xa/0xe [mac80211]
      [  348.530014]  [<c04c769c>] dev_hard_start_xmit+0x1f8/0x267
      [  348.530014]  [<c04c7261>] ?  validate_xmit_skb.isra.120.part.121+0x10/0x253
      [  348.530014]  [<c04defc6>] sch_direct_xmit+0x8b/0x1b3
      [  348.530014]  [<c04c7a9c>] __dev_queue_xmit+0x2c8/0x513
      [  348.530014]  [<c04c7cfb>] dev_queue_xmit+0xa/0xc
      [  348.530014]  [<f91bfc7a>] batadv_send_skb_packet+0xd6/0xec [batman_adv]
      [  348.530014]  [<f91bfdc4>] batadv_send_unicast_skb+0x15/0x4a [batman_adv]
      [  348.530014]  [<f91b5938>] batadv_dat_send_data+0x27e/0x310 [batman_adv]
      [  348.530014]  [<f91c30b5>] ? batadv_tt_global_hash_find.isra.11+0x8/0xa [batman_adv]
      [  348.530014]  [<f91b63f3>] batadv_dat_snoop_outgoing_arp_request+0x208/0x23d [batman_adv]
      [  348.530014]  [<f91c0cd9>] batadv_interface_tx+0x206/0x385 [batman_adv]
      [  348.530014]  [<c04c769c>] dev_hard_start_xmit+0x1f8/0x267
      [  348.530014]  [<c04c7261>] ?  validate_xmit_skb.isra.120.part.121+0x10/0x253
      [  348.530014]  [<c04defc6>] sch_direct_xmit+0x8b/0x1b3
      [  348.530014]  [<c04c7a9c>] __dev_queue_xmit+0x2c8/0x513
      [  348.530014]  [<f80cbd2a>] ? igb_xmit_frame+0x57/0x72 [igb]
      [  348.530014]  [<c04c7cfb>] dev_queue_xmit+0xa/0xc
      [  348.530014]  [<f843a326>] br_dev_queue_push_xmit+0xeb/0xfb [bridge]
      [  348.530014]  [<f843a35f>] br_forward_finish+0x29/0x74 [bridge]
      [  348.530014]  [<f843a23b>] ? deliver_clone+0x3b/0x3b [bridge]
      [  348.530014]  [<f843a714>] __br_forward+0x89/0xe7 [bridge]
      [  348.530014]  [<f843a336>] ? br_dev_queue_push_xmit+0xfb/0xfb [bridge]
      [  348.530014]  [<f843a234>] deliver_clone+0x34/0x3b [bridge]
      [  348.530014]  [<f843a68b>] ? br_flood+0x95/0x95 [bridge]
      [  348.530014]  [<f843a66d>] br_flood+0x77/0x95 [bridge]
      [  348.530014]  [<f843a809>] br_flood_forward+0x13/0x1a [bridge]
      [  348.530014]  [<f843a68b>] ? br_flood+0x95/0x95 [bridge]
      [  348.530014]  [<f843b877>] br_handle_frame_finish+0x392/0x3db [bridge]
      [  348.530014]  [<c04e9b2b>] ? nf_iterate+0x2b/0x6b
      [  348.530014]  [<f843baa6>] br_handle_frame+0x1e6/0x240 [bridge]
      [  348.530014]  [<f843b4e5>] ? br_handle_local_finish+0x6a/0x6a [bridge]
      [  348.530014]  [<c04c4ba0>] __netif_receive_skb_core+0x43a/0x66b
      [  348.530014]  [<f843b8c0>] ? br_handle_frame_finish+0x3db/0x3db [bridge]
      [  348.530014]  [<c023cea4>] ? resched_curr+0x19/0x37
      [  348.530014]  [<c0240707>] ? check_preempt_wakeup+0xbf/0xfe
      [  348.530014]  [<c0255dec>] ? ktime_get_with_offset+0x5c/0xfc
      [  348.530014]  [<c04c4fc1>] __netif_receive_skb+0x47/0x55
      [  348.530014]  [<c04c57ba>] netif_receive_skb_internal+0x40/0x5a
      [  348.530014]  [<c04c61ef>] napi_gro_receive+0x3a/0x94
      [  348.530014]  [<f80ce8d5>] igb_poll+0x6fd/0x9ad [igb]
      [  348.530014]  [<c0242bd8>] ? swake_up_locked+0x14/0x26
      [  348.530014]  [<c04c5d29>] net_rx_action+0xde/0x250
      [  348.530014]  [<c022a743>] __do_softirq+0x8a/0x163
      [  348.530014]  [<c022a6b9>] ? __hrtimer_tasklet_trampoline+0x19/0x19
      [  348.530014]  [<c021100f>] do_softirq_own_stack+0x26/0x2c
      [  348.530014]  <IRQ>
      [  348.530014]  [<c022a957>] irq_exit+0x31/0x6f
      [  348.530014]  [<c0210eb2>] do_IRQ+0x8d/0xa0
      [  348.530014]  [<c058152c>] common_interrupt+0x2c/0x40
      [  348.530014] Code: e7 8c 00 66 81 ff 88 00 75 12 85 d2 75 0e b2 c3 b8 83 e9 29 f9 e8 a7 5f f9 c6 eb 74 66 81 e3 8c 005
      [  348.530014] EIP: [<f929245d>] ieee80211_mps_set_frame_flags+0x40/0xaa [mac80211] SS:ESP 0068:f6409a40
      [  348.530014] CR2: 0000000000020040
      [  348.530014] ---[ end trace 48556ac26779732e ]---
      [  348.530014] Kernel panic - not syncing: Fatal exception in interrupt
      [  348.530014] Kernel Offset: disabled
      Reported-by: default avatarFred Veldini <fred.veldini@gmail.com>
      Tested-by: default avatarFred Veldini <fred.veldini@gmail.com>
      Signed-off-by: default avatarBob Copeland <me@bobcopeland.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      11d71499
    • Feng Tang's avatar
      net: alx: Work around the DMA RX overflow issue · 5bccf771
      Feng Tang authored
      commit 881d0327 upstream.
      
      Note: This is a verified backported patch for stable 4.4 kernel, and it
      could also be applied to 4.3/4.2/4.1/3.18/3.16
      
      There is a problem with alx devices, that the network link will be
      lost in 1-5 minutes after the device is up.
      
      >From debugging without datasheet, we found the error always
      happen when the DMA RX address is set to 0x....fc0, which is very
      likely to be a HW/silicon problem.
      
      This patch will apply rx skb with 64 bytes longer space, and if the
      allocated skb has a 0x...fc0 address, it will use skb_resever(skb, 64)
      to advance the address, so that the RX overflow can be avoided.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=70761Signed-off-by: default avatarFeng Tang <feng.tang@intel.com>
      Suggested-by: default avatarEric Dumazet <edumazet@google.com>
      Tested-by: default avatarOle Lukoie <olelukoie@mail.ru>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      5bccf771
    • Tom Goff's avatar
      ipmr/ip6mr: Initialize the last assert time of mfc entries. · 23f2a4ce
      Tom Goff authored
      commit 70a0dec4 upstream.
      
      This fixes wrong-interface signaling on 32-bit platforms for entries
      created when jiffies > 2^31 + MFC_ASSERT_THRESH.
      Signed-off-by: default avatarTom Goff <thomas.goff@ll.mit.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      23f2a4ce
    • Simon Horman's avatar
      sit: correct IP protocol used in ipip6_err · d09f7f9a
      Simon Horman authored
      commit d5d8760b upstream.
      
      Since 32b8a8e5 ("sit: add IPv4 over IPv4 support")
      ipip6_err() may be called for packets whose IP protocol is
      IPPROTO_IPIP as well as those whose IP protocol is IPPROTO_IPV6.
      
      In the case of IPPROTO_IPIP packets the correct protocol value is not
      passed to ipv4_update_pmtu() or ipv4_redirect().
      
      This patch resolves this problem by using the IP protocol of the packet
      rather than a hard-coded value. This appears to be consistent
      with the usage of the protocol of a packet by icmp_socket_deliver()
      the caller of ipip6_err().
      
      I was able to exercise the redirect case by using a setup where an ICMP
      redirect was received for the destination of the encapsulated packet.
      However, it appears that although incorrect the protocol field is not used
      in this case and thus no problem manifests.  On inspection it does not
      appear that a problem will manifest in the fragmentation needed/update pmtu
      case either.
      
      In short I believe this is a cosmetic fix. None the less, the use of
      IPPROTO_IPV6 seems wrong and confusing.
      Reviewed-by: default avatarDinan Gunawardena <dinan.gunawardena@netronome.com>
      Signed-off-by: default avatarSimon Horman <simon.horman@netronome.com>
      Acked-by: default avatarYOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d09f7f9a
    • Herbert Xu's avatar
      crypto: scatterwalk - Fix test in scatterwalk_done · 801b100b
      Herbert Xu authored
      commit 5f070e81 upstream.
      
      When there is more data to be processed, the current test in
      scatterwalk_done may prevent us from calling pagedone even when
      we should.
      
      In particular, if we're on an SG entry spanning multiple pages
      where the last page is not a full page, we will incorrectly skip
      calling pagedone on the second last page.
      
      This patch fixes this by adding a separate test for whether we've
      reached the end of a page.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      801b100b
    • Herbert Xu's avatar
      crypto: gcm - Filter out async ghash if necessary · ef15fd6c
      Herbert Xu authored
      commit b30bdfa8 upstream.
      
      As it is if you ask for a sync gcm you may actually end up with
      an async one because it does not filter out async implementations
      of ghash.
      
      This patch fixes this by adding the necessary filter when looking
      for ghash.
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      ef15fd6c
    • Linus Walleij's avatar
      crypto: ux500 - memmove the right size · 7e2e2115
      Linus Walleij authored
      commit 19ced623 upstream.
      
      The hash buffer is really HASH_BLOCK_SIZE bytes, someone
      must have thought that memmove takes n*u32 words by mistake.
      Tests work as good/bad as before after this patch.
      
      Cc: Joakim Bech <joakim.bech@linaro.org>
      Cc: stable@vger.kernel.org
      Reported-by: default avatarDavid Binderman <linuxdev.baldrick@gmail.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      7e2e2115
    • Al Viro's avatar
      fix d_walk()/non-delayed __d_free() race · 742d555a
      Al Viro authored
      commit 3d56c25e upstream.
      
      Ascend-to-parent logics in d_walk() depends on all encountered child
      dentries not getting freed without an RCU delay.  Unfortunately, in
      quite a few cases it is not true, with hard-to-hit oopsable race as
      the result.
      
      Fortunately, the fix is simiple; right now the rule is "if it ever
      been hashed, freeing must be delayed" and changing it to "if it
      ever had a parent, freeing must be delayed" closes that hole and
      covers all cases the old rule used to cover.  Moreover, pipes and
      sockets remain _not_ covered, so we do not introduce RCU delay in
      the cases which are the reason for having that delay conditional
      in the first place.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [wt: add the required change to __d_materialise_dentry() for kernels
        older than v3.17]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      742d555a
  2. 22 Aug, 2016 8 commits
    • Jann Horn's avatar
      ecryptfs: forbid opening files without mmap handler · 88ac3837
      Jann Horn authored
      commit 2f36db71 upstream.
      
      This prevents users from triggering a stack overflow through a recursive
      invocation of pagefault handling that involves mapping procfs files into
      virtual memory.
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Acked-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      88ac3837
    • Helge Deller's avatar
      parisc: Fix pagefault crash in unaligned __get_user() call · fcf55035
      Helge Deller authored
      commit 8b78f260 upstream.
      
      One of the debian buildd servers had this crash in the syslog without
      any other information:
      
       Unaligned handler failed, ret = -2
       clock_adjtime (pid 22578): Unaligned data reference (code 28)
       CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G  E  4.5.0-2-parisc64-smp #1 Debian 4.5.4-1
       task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000
      
            YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
       PSW: 00001000000001001111100000001111 Tainted: G            E
       r00-03  000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0
       r04-07  00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff
       r08-11  0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4
       r12-15  000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b
       r16-19  0000000000028800 0000000000000001 0000000000000070 00000001bde7c218
       r20-23  0000000000000000 00000001bde7c210 0000000000000002 0000000000000000
       r24-27  0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0
       r28-31  0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218
       sr00-03  0000000001200000 0000000001200000 0000000000000000 0000000001200000
       sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000
      
       IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88
        IIR: 0ca0d089    ISR: 0000000001200000  IOR: 00000000fa6f7fff
        CPU:        1   CR30: 00000001bde7c000 CR31: ffffffffffffffff
        ORIG_R28: 00000002369fe628
        IAOQ[0]: compat_get_timex+0x2dc/0x3c0
        IAOQ[1]: compat_get_timex+0x2e0/0x3c0
        RP(r2): compat_get_timex+0x40/0x3c0
       Backtrace:
        [<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0
        [<0000000040205024>] syscall_exit+0x0/0x14
      
      This means the userspace program clock_adjtime called the clock_adjtime()
      syscall and then crashed inside the compat_get_timex() function.
      Syscalls should never crash programs, but instead return EFAULT.
      
      The IIR register contains the executed instruction, which disassebles
      into "ldw 0(sr3,r5),r9".
      This load-word instruction is part of __get_user() which tried to read the word
      at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in.  The
      unaligned handler is able to emulate all ldw instructions, but it fails if it
      fails to read the source e.g. because of page fault.
      
      The following program reproduces the problem:
      
      #define _GNU_SOURCE
      #include <unistd.h>
      #include <sys/syscall.h>
      #include <sys/mman.h>
      
      int main(void) {
              /* allocate 8k */
              char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
              /* free second half (upper 4k) and make it invalid. */
              munmap(ptr+4096, 4096);
              /* syscall where first int is unaligned and clobbers into invalid memory region */
              /* syscall should return EFAULT */
              return syscall(__NR_clock_adjtime, 0, ptr+4095);
      }
      
      To fix this issue we simply need to check if the faulting instruction address
      is in the exception fixup table when the unaligned handler failed. If it
      is, call the fixup routine instead of crashing.
      
      While looking at the unaligned handler I found another issue as well: The
      target register should not be modified if the handler was unsuccessful.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      fcf55035
    • Dave Weinstein's avatar
      arm: oabi compat: add missing access checks · d948109d
      Dave Weinstein authored
      commit 7de24996 upstream.
      
      Add access checks to sys_oabi_epoll_wait() and sys_oabi_semtimedop().
      This fixes CVE-2016-3857, a local privilege escalation under
      CONFIG_OABI_COMPAT.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarChiachih Wu <wuchiachih@gmail.com>
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Reviewed-by: default avatarNicolas Pitre <nico@linaro.org>
      Signed-off-by: default avatarDave Weinstein <olorin@google.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      d948109d
    • Russell King's avatar
      ARM: fix PTRACE_SETVFPREGS on SMP systems · 951b391d
      Russell King authored
      commit e2dfb4b8 upstream.
      
      PTRACE_SETVFPREGS fails to properly mark the VFP register set to be
      reloaded, because it undoes one of the effects of vfp_flush_hwstate().
      
      Specifically vfp_flush_hwstate() sets thread->vfpstate.hard.cpu to
      an invalid CPU number, but vfp_set() overwrites this with the original
      CPU number, thereby rendering the hardware state as apparently "valid",
      even though the software state is more recent.
      
      Fix this by reverting the previous change.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 8130b9d7 ("ARM: 7308/1: vfp: flush thread hwstate before copying ptrace registers")
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarSimon Marchi <simon.marchi@ericsson.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      951b391d
    • Paolo Bonzini's avatar
      KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS · 42bc57b6
      Paolo Bonzini authored
      commit d14bdb55 upstream.
      
      MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to
      any of bits 63:32.  However, this is not detected at KVM_SET_DEBUGREGS
      time, and the next KVM_RUN oopses:
      
         general protection fault: 0000 [#1] SMP
         CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
         Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
         [...]
         Call Trace:
          [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm]
          [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
          [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
          [<ffffffff812418a9>] SyS_ioctl+0x79/0x90
          [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71
         Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
         RIP  [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40
          RSP <ffff88005836bd50>
      
      Testcase (beautified/reduced from syzkaller output):
      
          #include <unistd.h>
          #include <sys/syscall.h>
          #include <string.h>
          #include <stdint.h>
          #include <linux/kvm.h>
          #include <fcntl.h>
          #include <sys/ioctl.h>
      
          long r[8];
      
          int main()
          {
              struct kvm_debugregs dr = { 0 };
      
              r[2] = open("/dev/kvm", O_RDONLY);
              r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
              r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);
      
              memcpy(&dr,
                     "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72"
                     "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8"
                     "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9"
                     "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb",
                     48);
              r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr);
              r[6] = ioctl(r[4], KVM_RUN, 0);
          }
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      42bc57b6
    • Dave Chinner's avatar
      xfs: skip stale inodes in xfs_iflush_cluster · 9eccedc4
      Dave Chinner authored
      commit 7d3aa7fe upstream.
      
      We don't write back stale inodes so we should skip them in
      xfs_iflush_cluster, too.
      
      cc: <stable@vger.kernel.org> # 3.10.x-
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      9eccedc4
    • Dave Chinner's avatar
      xfs: fix inode validity check in xfs_iflush_cluster · 360914d6
      Dave Chinner authored
      commit 51b07f30 upstream.
      
      Some careless idiot(*) wrote crap code in commit 1a3e8f3d ("xfs:
      convert inode cache lookups to use RCU locking") back in late 2010,
      and so xfs_iflush_cluster checks the wrong inode for whether it is
      still valid under RCU protection. Fix it to lock and check the
      correct inode.
      
      (*) Careless-idiot: Dave Chinner <dchinner@redhat.com>
      
      cc: <stable@vger.kernel.org> # 3.10.x-
      Discovered-by: default avatarBrain Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      360914d6
    • Dave Chinner's avatar
      xfs: xfs_iflush_cluster fails to abort on error · 01ee4801
      Dave Chinner authored
      commit b1438f47 upstream.
      
      When a failure due to an inode buffer occurs, the error handling
      fails to abort the inode writeback correctly. This can result in the
      inode being reclaimed whilst still in the AIL, leading to
      use-after-free situations as well as filesystems that cannot be
      unmounted as the inode log items left in the AIL never get removed.
      
      Fix this by ensuring fatal errors from xfs_imap_to_bp() result in
      the inode flush being aborted correctly.
      Reported-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Diagnosed-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Tested-by: default avatarShyam Kaushik <shyam@zadarastorage.com>
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [wt: in kernels < 3.17, the error sign is positive, not negative]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      01ee4801
  3. 21 Aug, 2016 19 commits
    • Ville Syrjälä's avatar
      dma-debug: avoid spinlock recursion when disabling dma-debug · 0ed4547f
      Ville Syrjälä authored
      commit 3017cd63 upstream.
      
      With netconsole (at least) the pr_err("...  disablingn") call can
      recurse back into the dma-debug code, where it'll try to grab
      free_entries_lock again.  Avoid the problem by doing the printk after
      dropping the lock.
      
      Link: http://lkml.kernel.org/r/1463678421-18683-1-git-send-email-ville.syrjala@linux.intel.comSigned-off-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      0ed4547f
    • Vegard Nossum's avatar
      ext4: fix reference counting bug on block allocation error · e7dcdba7
      Vegard Nossum authored
      commit 554a5ccc upstream.
      
      If we hit this error when mounted with errors=continue or
      errors=remount-ro:
      
          EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata
      
      then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
      continue. However, ext4_mb_release_context() is the wrong thing to call
      here since we are still actually using the allocation context.
      
      Instead, just error out. We could retry the allocation, but there is a
      possibility of getting stuck in an infinite loop instead, so this seems
      safer.
      
      [ Fixed up so we don't return EAGAIN to userspace. --tytso ]
      
      Fixes: 8556e8f3 ("ext4: Don't allow new groups to be added during block allocation")
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: stable@vger.kernel.org
      [wt: 3.10 doesn't have EFSCORRUPTED, but XFS uses EUCLEAN as does 3.14
           on this patch so use this instead]
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      e7dcdba7
    • Vegard Nossum's avatar
      ext4: short-cut orphan cleanup on error · bf86199b
      Vegard Nossum authored
      commit c65d5c6c upstream.
      
      If we encounter a filesystem error during orphan cleanup, we should stop.
      Otherwise, we may end up in an infinite loop where the same inode is
      processed again and again.
      
          EXT4-fs (loop0): warning: checktime reached, running e2fsck is recommended
          EXT4-fs error (device loop0): ext4_mb_generate_buddy:758: group 2, block bitmap and bg descriptor inconsistent: 6117 vs 0 free clusters
          Aborting journal on device loop0-8.
          EXT4-fs (loop0): Remounting filesystem read-only
          EXT4-fs error (device loop0) in ext4_free_blocks:4895: Journal has aborted
          EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
          EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
          EXT4-fs error (device loop0) in ext4_ext_remove_space:3068: IO failure
          EXT4-fs error (device loop0) in ext4_ext_truncate:4667: Journal has aborted
          EXT4-fs error (device loop0) in ext4_orphan_del:2927: Journal has aborted
          EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
          EXT4-fs (loop0): Inode 16 (00000000618192a0): orphan list check failed!
          [...]
          EXT4-fs (loop0): Inode 16 (0000000061819748): orphan list check failed!
          [...]
          EXT4-fs (loop0): Inode 16 (0000000061819bf0): orphan list check failed!
          [...]
      
      See-also: c9eb13a9 ("ext4: fix hang when processing corrupted orphaned inode list")
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      bf86199b
    • Vegard Nossum's avatar
      ext4: don't call ext4_should_journal_data() on the journal inode · 8b6ab35c
      Vegard Nossum authored
      commit 6a7fd522 upstream.
      
      If ext4_fill_super() fails early, it's possible for ext4_evict_inode()
      to call ext4_should_journal_data() before superblock options and flags
      are fully set up.  In that case, the iput() on the journal inode can
      end up causing a BUG().
      
      Work around this problem by reordering the tests so we only call
      ext4_should_journal_data() after we know it's not the journal inode.
      
      Fixes: 2d859db3 ("ext4: fix data corruption in inodes with journalled data")
      Fixes: 2b405bfa ("ext4: fix data=journal fast mount/umount hang")
      Cc: Jan Kara <jack@suse.cz>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      8b6ab35c
    • Vegard Nossum's avatar
      ext4: check for extents that wrap around · 6dc68acb
      Vegard Nossum authored
      commit f70749ca upstream.
      
      An extent with lblock = 4294967295 and len = 1 will pass the
      ext4_valid_extent() test:
      
      	ext4_lblk_t last = lblock + len - 1;
      
      	if (len == 0 || lblock > last)
      		return 0;
      
      since last = 4294967295 + 1 - 1 = 4294967295. This would later trigger
      the BUG_ON(es->es_lblk + es->es_len < es->es_lblk) in ext4_es_end().
      
      We can simplify it by removing the - 1 altogether and changing the test
      to use lblock + len <= lblock, since now if len = 0, then lblock + 0 ==
      lblock and it fails, and if len > 0 then lblock + len > lblock in order
      to pass (i.e. it doesn't overflow).
      
      Fixes: 5946d089 ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
      Fixes: 2f974865 ("ext4: check for zero length extent explicitly")
      Cc: Eryu Guan <guaneryu@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPhil Turnbull <phil.turnbull@oracle.com>
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6dc68acb
    • Vegard Nossum's avatar
      ext4: verify extent header depth · ea8f8498
      Vegard Nossum authored
      commit 7bc94916 upstream.
      
      Although the extent tree depth of 5 should enough be for the worst
      case of 2*32 extents of length 1, the extent tree code does not
      currently to merge nodes which are less than half-full with a sibling
      node, or to shrink the tree depth if possible.  So it's possible, at
      least in theory, for the tree depth to be greater than 5.  However,
      even in the worst case, a tree depth of 32 is highly unlikely, and if
      the file system is maliciously corrupted, an insanely large eh_depth
      can cause memory allocation failures that will trigger kernel warnings
      (here, eh_depth = 65280):
      
          JBD2: ext4.exe wants too many credits credits:195849 rsv_credits:0 max:256
          ------------[ cut here ]------------
          WARNING: CPU: 0 PID: 50 at fs/jbd2/transaction.c:293 start_this_handle+0x569/0x580
          CPU: 0 PID: 50 Comm: ext4.exe Not tainted 4.7.0-rc5+ #508
          Stack:
           604a8947 625badd8 0002fd09 00000000
           60078643 00000000 62623910 601bf9bc
           62623970 6002fc84 626239b0 900000125
          Call Trace:
           [<6001c2dc>] show_stack+0xdc/0x1a0
           [<601bf9bc>] dump_stack+0x2a/0x2e
           [<6002fc84>] __warn+0x114/0x140
           [<6002fdff>] warn_slowpath_null+0x1f/0x30
           [<60165829>] start_this_handle+0x569/0x580
           [<60165d4e>] jbd2__journal_start+0x11e/0x220
           [<60146690>] __ext4_journal_start_sb+0x60/0xa0
           [<60120a81>] ext4_truncate+0x131/0x3a0
           [<60123677>] ext4_setattr+0x757/0x840
           [<600d5d0f>] notify_change+0x16f/0x2a0
           [<600b2b16>] do_truncate+0x76/0xc0
           [<600c3e56>] path_openat+0x806/0x1300
           [<600c55c9>] do_filp_open+0x89/0xf0
           [<600b4074>] do_sys_open+0x134/0x1e0
           [<600b4140>] SyS_open+0x20/0x30
           [<6001ea68>] handle_syscall+0x88/0x90
           [<600295fd>] userspace+0x3fd/0x500
           [<6001ac55>] fork_handler+0x85/0x90
      
          ---[ end trace 08b0b88b6387a244 ]---
      
      [ Commit message modified and the extent tree depath check changed
      from 5 to 32 -- tytso ]
      
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      ea8f8498
    • Nicolai Stange's avatar
      ext4: silence UBSAN in ext4_mb_init() · cb0b53ed
      Nicolai Stange authored
      commit 935244cd upstream.
      
      Currently, in ext4_mb_init(), there's a loop like the following:
      
        do {
          ...
          offset += 1 << (sb->s_blocksize_bits - i);
          i++;
        } while (i <= sb->s_blocksize_bits + 1);
      
      Note that the updated offset is used in the loop's next iteration only.
      
      However, at the last iteration, that is at i == sb->s_blocksize_bits + 1,
      the shift count becomes equal to (unsigned)-1 > 31 (c.f. C99 6.5.7(3))
      and UBSAN reports
      
        UBSAN: Undefined behaviour in fs/ext4/mballoc.c:2621:15
        shift exponent 4294967295 is too large for 32-bit type 'int'
        [...]
        Call Trace:
         [<ffffffff818c4d25>] dump_stack+0xbc/0x117
         [<ffffffff818c4c69>] ? _atomic_dec_and_lock+0x169/0x169
         [<ffffffff819411ab>] ubsan_epilogue+0xd/0x4e
         [<ffffffff81941cac>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
         [<ffffffff81941ab1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
         [<ffffffff814b6dc1>] ? kmem_cache_alloc+0x101/0x390
         [<ffffffff816fc13b>] ? ext4_mb_init+0x13b/0xfd0
         [<ffffffff814293c7>] ? create_cache+0x57/0x1f0
         [<ffffffff8142948a>] ? create_cache+0x11a/0x1f0
         [<ffffffff821c2168>] ? mutex_lock+0x38/0x60
         [<ffffffff821c23ab>] ? mutex_unlock+0x1b/0x50
         [<ffffffff814c26ab>] ? put_online_mems+0x5b/0xc0
         [<ffffffff81429677>] ? kmem_cache_create+0x117/0x2c0
         [<ffffffff816fcc49>] ext4_mb_init+0xc49/0xfd0
         [...]
      
      Observe that the mentioned shift exponent, 4294967295, equals (unsigned)-1.
      
      Unless compilers start to do some fancy transformations (which at least
      GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
      such calculated value of offset is never used again.
      
      Silence UBSAN by introducing another variable, offset_incr, holding the
      next increment to apply to offset and adjust that one by right shifting it
      by one position per loop iteration.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarNicolai Stange <nicstange@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      cb0b53ed
    • Nicolai Stange's avatar
      ext4: address UBSAN warning in mb_find_order_for_block() · 58e32a8d
      Nicolai Stange authored
      commit b5cb316c upstream.
      
      Currently, in mb_find_order_for_block(), there's a loop like the following:
      
        while (order <= e4b->bd_blkbits + 1) {
          ...
          bb += 1 << (e4b->bd_blkbits - order);
        }
      
      Note that the updated bb is used in the loop's next iteration only.
      
      However, at the last iteration, that is at order == e4b->bd_blkbits + 1,
      the shift count becomes negative (c.f. C99 6.5.7(3)) and UBSAN reports
      
        UBSAN: Undefined behaviour in fs/ext4/mballoc.c:1281:11
        shift exponent -1 is negative
        [...]
        Call Trace:
         [<ffffffff818c4d35>] dump_stack+0xbc/0x117
         [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169
         [<ffffffff819411bb>] ubsan_epilogue+0xd/0x4e
         [<ffffffff81941cbc>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
         [<ffffffff81941ac1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
         [<ffffffff816e93a0>] ? ext4_mb_generate_from_pa+0x590/0x590
         [<ffffffff816502c8>] ? ext4_read_block_bitmap_nowait+0x598/0xe80
         [<ffffffff816e7b7e>] mb_find_order_for_block+0x1ce/0x240
         [...]
      
      Unless compilers start to do some fancy transformations (which at least
      GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
      such calculated value of bb is never used again.
      
      Silence UBSAN by introducing another variable, bb_incr, holding the next
      increment to apply to bb and adjust that one by right shifting it by one
      position per loop iteration.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarNicolai Stange <nicstange@gmail.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      58e32a8d
    • Theodore Ts'o's avatar
      ext4: fix hang when processing corrupted orphaned inode list · 3d1658bb
      Theodore Ts'o authored
      commit c9eb13a9 upstream.
      
      If the orphaned inode list contains inode #5, ext4_iget() returns a
      bad inode (since the bootloader inode should never be referenced
      directly).  Because of the bad inode, we end up processing the inode
      repeatedly and this hangs the machine.
      
      This can be reproduced via:
      
         mke2fs -t ext4 /tmp/foo.img 100
         debugfs -w -R "ssv last_orphan 5" /tmp/foo.img
         mount -o loop /tmp/foo.img /mnt
      
      (But don't do this if you are using an unpatched kernel if you care
      about the system staying functional.  :-)
      
      This bug was found by the port of American Fuzzy Lop into the kernel
      to find file system problems[1].  (Since it *only* happens if inode #5
      shows up on the orphan list --- 3, 7, 8, etc. won't do it, it's not
      surprising that AFL needed two hours before it found it.)
      
      [1] http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf
      
      Cc: stable@vger.kernel.org
      Reported by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      3d1658bb
    • Alex Deucher's avatar
      drm/radeon: fix firmware info version checks · 01537846
      Alex Deucher authored
      commit 3edc38a0 upstream.
      
      Some of the checks didn't handle frev 2 tables properly.
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      01537846
    • Lyude's avatar
      drm/radeon: Poll for both connect/disconnect on analog connectors · 14a039bc
      Lyude authored
      commit 14ff8d48 upstream.
      
      DRM_CONNECTOR_POLL_CONNECT only enables polling for connections, not
      disconnections. Because of this, we end up losing hotplug polling for
      analog connectors once they get connected.
      
      Easy way to reproduce:
       - Grab a machine with a radeon GPU and a VGA port
       - Plug a monitor into the VGA port, wait for it to update the connector
         from disconnected to connected
       - Disconnect the monitor on VGA, a hotplug event is never sent for the
         removal of the connector.
      
      Originally, only using DRM_CONNECTOR_POLL_CONNECT might have been a good
      idea since doing VGA polling can sometimes result in having to mess with
      the DAC voltages to figure out whether or not there's actually something
      there since VGA doesn't have HPD. Doing this would have the potential of
      showing visible artifacts on the screen every time we ran a poll while a
      VGA display was connected. Luckily, radeon_vga_detect() only resorts to
      this sort of polling if the poll is forced, and DRM's polling helper
      doesn't force it's polls.
      
      Additionally, this removes some assignments to connector->polled that
      weren't actually doing anything.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLyude <cpaul@redhat.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      14a039bc
    • Alex Deucher's avatar
      drm/radeon: add a delay after ATPX dGPU power off · 6f78f4c5
      Alex Deucher authored
      commit d814b24f upstream.
      
      ATPX dGPU power control requires a 200ms delay between
      power off and on.  This should fix dGPU failures on
      resume from power off.
      Reviewed-by: default avatarHawking Zhang <Hawking.Zhang@amd.com>
      Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      6f78f4c5
    • Alex Deucher's avatar
      drm/radeon: fix asic initialization for virtualized environments · fee154e5
      Alex Deucher authored
      commit 05082b8b upstream.
      
      When executing in a PCI passthrough based virtuzliation environment, the
      hypervisor will usually attempt to send a PCIe bus reset signal to the
      ASIC when the VM reboots. In this scenario, the card is not correctly
      initialized, but we still consider it to be posted. Therefore, in a
      passthrough based environemnt we should always post the card to guarantee
      it is in a good state for driver initialization.
      
      Ported from amdgpu commit:
      amdgpu: fix asic initialization for virtualized environments
      
      Cc: Andres Rodriguez <andres.rodriguez@amd.com>
      Cc: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      fee154e5
    • Lyude's avatar
      drm/fb_helper: Fix references to dev->mode_config.num_connector · 30e00cbd
      Lyude authored
      commit 255f0e7c upstream.
      
      During boot, MST hotplugs are generally expected (even if no physical
      hotplugging occurs) and result in DRM's connector topology changing.
      This means that using num_connector from the current mode configuration
      can lead to the number of connectors changing under us. This can lead to
      some nasty scenarios in fbcon:
      
      - We allocate an array to the size of dev->mode_config.num_connectors.
      - MST hotplug occurs, dev->mode_config.num_connectors gets incremented.
      - We try to loop through each element in the array using the new value
        of dev->mode_config.num_connectors, and end up going out of bounds
        since dev->mode_config.num_connectors is now larger then the array we
        allocated.
      
      fb_helper->connector_count however, will always remain consistent while
      we do a modeset in fb_helper.
      
      Note: This is just polish for 4.7, Dave Airlie's drm_connector
      refcounting fixed these bugs for real. But it's good enough duct-tape
      for stable kernel backporting, since backporting the refcounting
      changes is way too invasive.
      Signed-off-by: default avatarLyude <cpaul@redhat.com>
      [danvet: Clarify why we need this. Also remove the now unused "dev"
      local variable to appease gcc.]
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Link: http://patchwork.freedesktop.org/patch/msgid/1463065021-18280-3-git-send-email-cpaul@redhat.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      30e00cbd
    • Itai Handler's avatar
      drm/gma500: Fix possible out of bounds read · 184f1f01
      Itai Handler authored
      commit 7ccca1d5 upstream.
      
      Fix possible out of bounds read, by adding missing comma.
      The code may read pass the end of the dsi_errors array
      when the most significant bit (bit #31) in the intr_stat register
      is set.
      This bug has been detected using CppCheck (static analysis tool).
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarItai Handler <itai_handler@hotmail.com>
      Signed-off-by: default avatarPatrik Jakobsson <patrik.r.jakobsson@gmail.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      184f1f01
    • Tomáš Trnka's avatar
      sunrpc: fix stripping of padded MIC tokens · b3feb526
      Tomáš Trnka authored
      commit c0cb8bf3 upstream.
      
      The length of the GSS MIC token need not be a multiple of four bytes.
      It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data()
      would previously only trim mic.len + 4 B. The remaining up to three
      bytes would then trigger a check in nfs4svc_decode_compoundargs(),
      leading to a "garbage args" error and mount failure:
      
      nfs4svc_decode_compoundargs: compound not properly padded!
      nfsd: failed to decode arguments!
      
      This would prevent older clients using the pre-RFC 4121 MIC format
      (37-byte MIC including a 9-byte OID) from mounting exports from v3.9+
      servers using krb5i.
      
      The trimming was introduced by commit 4c190e2f ("sunrpc: trim off
      trailing checksum before returning decrypted or integrity authenticated
      buffer").
      
      Fixes: 4c190e2f "unrpc: trim off trailing checksum..."
      Signed-off-by: default avatarTomáš Trnka <ttrnka@mail.muni.cz>
      Cc: stable@vger.kernel.org
      Acked-by: default avatarJeff Layton <jlayton@poochiereds.net>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b3feb526
    • Cyril Bur's avatar
      powerpc/tm: Always reclaim in start_thread() for exec() class syscalls · 8110080d
      Cyril Bur authored
      commit 8e96a87c upstream.
      
      Userspace can quite legitimately perform an exec() syscall with a
      suspended transaction. exec() does not return to the old process, rather
      it load a new one and starts that, the expectation therefore is that the
      new process starts not in a transaction. Currently exec() is not treated
      any differently to any other syscall which creates problems.
      
      Firstly it could allow a new process to start with a suspended
      transaction for a binary that no longer exists. This means that the
      checkpointed state won't be valid and if the suspended transaction were
      ever to be resumed and subsequently aborted (a possibility which is
      exceedingly likely as exec()ing will likely doom the transaction) the
      new process will jump to invalid state.
      
      Secondly the incorrect attempt to keep the transactional state while
      still zeroing state for the new process creates at least two TM Bad
      Things. The first triggers on the rfid to return to userspace as
      start_thread() has given the new process a 'clean' MSR but the suspend
      will still be set in the hardware MSR. The second TM Bad Thing triggers
      in __switch_to() as the processor is still transactionally suspended but
      __switch_to() wants to zero the TM sprs for the new process.
      
      This is an example of the outcome of calling exec() with a suspended
      transaction. Note the first 700 is likely the first TM bad thing
      decsribed earlier only the kernel can't report it as we've loaded
      userspace registers. c000000000009980 is the rfid in
      fast_exception_return()
      
        Bad kernel stack pointer 3fffcfa1a370 at c000000000009980
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        CPU: 0 PID: 2006 Comm: tm-execed Not tainted
        NIP: c000000000009980 LR: 0000000000000000 CTR: 0000000000000000
        REGS: c00000003ffefd40 TRAP: 0700   Not tainted
        MSR: 8000000300201031 <SF,ME,IR,DR,LE,TM[SE]>  CR: 00000000  XER: 00000000
        CFAR: c0000000000098b4 SOFTE: 0
        PACATMSCRATCH: b00000010000d033
        GPR00: 0000000000000000 00003fffcfa1a370 0000000000000000 0000000000000000
        GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR12: 00003fff966611c0 0000000000000000 0000000000000000 0000000000000000
        NIP [c000000000009980] fast_exception_return+0xb0/0xb8
        LR [0000000000000000]           (null)
        Call Trace:
        Instruction dump:
        f84d0278 e9a100d8 7c7b03a6 e84101a0 7c4ff120 e8410170 7c5a03a6 e8010070
        e8410080 e8610088 e8810090 e8210078 <4c000024> 48000000 e8610178 88ed023b
      
        Kernel BUG at c000000000043e80 [verbose debug info unavailable]
        Unexpected TM Bad Thing exception at c000000000043e80 (msr 0x201033)
        Oops: Unrecoverable exception, sig: 6 [#2]
        CPU: 0 PID: 2006 Comm: tm-execed Tainted: G      D
        task: c0000000fbea6d80 ti: c00000003ffec000 task.ti: c0000000fb7ec000
        NIP: c000000000043e80 LR: c000000000015a24 CTR: 0000000000000000
        REGS: c00000003ffef7e0 TRAP: 0700   Tainted: G      D
        MSR: 8000000300201033 <SF,ME,IR,DR,RI,LE,TM[SE]>  CR: 28002828  XER: 00000000
        CFAR: c000000000015a20 SOFTE: 0
        PACATMSCRATCH: b00000010000d033
        GPR00: 0000000000000000 c00000003ffefa60 c000000000db5500 c0000000fbead000
        GPR04: 8000000300001033 2222222222222222 2222222222222222 00000000ff160000
        GPR08: 0000000000000000 800000010000d033 c0000000fb7e3ea0 c00000000fe00004
        GPR12: 0000000000002200 c00000000fe00000 0000000000000000 0000000000000000
        GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR20: 0000000000000000 0000000000000000 c0000000fbea7410 00000000ff160000
        GPR24: c0000000ffe1f600 c0000000fbea8700 c0000000fbea8700 c0000000fbead000
        GPR28: c000000000e20198 c0000000fbea6d80 c0000000fbeab680 c0000000fbea6d80
        NIP [c000000000043e80] tm_restore_sprs+0xc/0x1c
        LR [c000000000015a24] __switch_to+0x1f4/0x420
        Call Trace:
        Instruction dump:
        7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
        4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
      
      This fixes CVE-2016-5828.
      
      Fixes: bc2a9408 ("powerpc: Hook in new transactional memory code")
      Cc: stable@vger.kernel.org # v3.9+
      Signed-off-by: default avatarCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      8110080d
    • Gavin Shan's avatar
      powerpc/pseries: Fix PCI config address for DDW · cc80e5c9
      Gavin Shan authored
      commit 8a934efe upstream.
      
      In commit 8445a87f "powerpc/iommu: Remove the dependency on EEH
      struct in DDW mechanism", the PE address was replaced with the PCI
      config address in order to remove dependency on EEH. According to PAPR
      spec, firmware (pHyp or QEMU) should accept "xxBBSSxx" format PCI config
      address, not "xxxxBBSS" provided by the patch. Note that "BB" is PCI bus
      number and "SS" is the combination of slot and function number.
      
      This fixes the PCI address passed to DDW RTAS calls.
      
      Fixes: 8445a87f ("powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism")
      Cc: stable@vger.kernel.org # v3.4+
      Reported-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Tested-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      cc80e5c9
    • Guilherme G. Piccoli's avatar
      powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism · b738ed81
      Guilherme G. Piccoli authored
      commit 8445a87f upstream.
      
      Commit 39baadbf ("powerpc/eeh: Remove eeh information from pci_dn")
      changed the pci_dn struct by removing its EEH-related members.
      As part of this clean-up, DDW mechanism was modified to read the device
      configuration address from eeh_dev struct.
      
      As a consequence, now if we disable EEH mechanism on kernel command-line
      for example, the DDW mechanism will fail, generating a kernel oops by
      dereferencing a NULL pointer (which turns to be the eeh_dev pointer).
      
      This patch just changes the configuration address calculation on DDW
      functions to a manual calculation based on pci_dn members instead of
      using eeh_dev-based address.
      
      No functional changes were made. This was tested on pSeries, both
      in PHyp and qemu guest.
      
      Fixes: 39baadbf ("powerpc/eeh: Remove eeh information from pci_dn")
      Cc: stable@vger.kernel.org # v3.4+
      Reviewed-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: default avatarGuilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarWilly Tarreau <w@1wt.eu>
      b738ed81