1. 03 Jul, 2019 29 commits
  2. 25 Jun, 2019 11 commits
    • Linux 5.1.15 · f0fae702
      Greg Kroah-Hartman authored
    • powerpc/mm/64s/hash: Reallocate context ids on fork · 1d7446de
      Michael Ellerman authored
      commit ca72d883 upstream.
      
      When using the Hash Page Table (HPT) MMU, userspace memory mappings
      are managed at two levels. Firstly in the Linux page tables, much like
      other architectures, and secondly in the SLB (Segment Lookaside
      Buffer) and HPT. It's the SLB and HPT that are actually used by the
      hardware to do translations.
      
      As part of the series adding support for 4PB user virtual address
      space using the hash MMU, we added support for allocating multiple
      "context ids" per process, one for each 512TB chunk of address space.
      These are tracked in an array called extended_id in the mm_context_t
      of a process that has done a mapping above 512TB.
      
      If such a process forks (i.e. clone(2) without CLONE_VM set) its mm is
      copied, including the mm_context_t, and then init_new_context() is
      called to reinitialise parts of the mm_context_t as appropriate to
      separate the address spaces of the two processes.
      
      The key step in ensuring the two processes have separate address
      spaces is to allocate a new context id for the process; this is done
      at the beginning of hash__init_new_context(). If we didn't allocate a
      new context id then the two processes would share mappings as far as
      the SLB and HPT are concerned, even though their Linux page tables
      would be separate.
      
      For mappings above 512TB, which use the extended_id array, we
      neglected to allocate new context ids on fork, meaning the parent and
      child use the same ids and therefore share those mappings even though
      they're supposed to be separate. This can lead to the parent seeing
      writes done by the child, which is essentially memory corruption.
      
      There is an additional exposure which is that if the child process
      exits, all its context ids are freed, including the context ids that
      are still in use by the parent for mappings above 512TB. One or more
      of those ids can then be reallocated to a third process, which
      can then read/write to the parent's mappings above 512TB. Additionally,
      if the freed id is used for the third process's primary context id,
      then the parent is able to read/write to the third process's mappings
      *below* 512TB.
      
      All of these are fundamental failures to enforce separation between
      processes. The only mitigating factor is that the bug only occurs if a
      process creates mappings above 512TB, and most applications still do
      not create such mappings.
      
      Only machines using the hash page table MMU are affected, eg. PowerPC
      970 (G5), PA6T, Power5/6/7/8/9. By default Power9 bare metal machines
      (powernv) use the Radix MMU and are not affected, unless the machine
      has been explicitly booted in HPT mode (using disable_radix on the
      kernel command line). KVM guests on Power9 may be affected if the host
      or guest is configured to use the HPT MMU. LPARs under PowerVM on
      Power9 are affected as they always use the HPT MMU. Kernels built with
      PAGE_SIZE=4K are not affected.
      
      The fix is relatively simple: we need to reallocate context ids for
      all extended mappings on fork.
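
      The failure mode and the fix can be illustrated with a small userspace
      model. This is a sketch under stated assumptions, not the kernel code:
      struct ctx, alloc_id(), fork_buggy(), fork_fixed() and shared_ids() are
      hypothetical names, the array size is arbitrary, and treating slot 0 as
      the primary context id is a simplification made for the sketch.

```c
#include <assert.h>
#include <string.h>

#define NR_IDS 8    /* arbitrary size chosen for this sketch */

/* Hypothetical stand-in for mm_context_t: slot 0 models the primary id,
 * the other slots model ids for chunks above 512TB; 0 means unused. */
struct ctx {
    int extended_id[NR_IDS];
};

static int next_id = 1;

/* Hypothetical allocator: hands out a fresh id on every call. */
int alloc_id(void)
{
    return next_id++;
}

/* Models the bug: the context is copied, but only the primary id is renewed. */
void fork_buggy(const struct ctx *parent, struct ctx *child)
{
    memcpy(child, parent, sizeof(*child));
    child->extended_id[0] = alloc_id();
}

/* Models the fix: every id that is in use is reallocated for the child. */
void fork_fixed(const struct ctx *parent, struct ctx *child)
{
    memcpy(child, parent, sizeof(*child));
    for (int i = 0; i < NR_IDS; i++) {
        if (child->extended_id[i] != 0)
            child->extended_id[i] = alloc_id();
    }
}

/* Counts the slots in which both contexts hold the same non-zero id. */
int shared_ids(const struct ctx *a, const struct ctx *b)
{
    int n = 0;

    for (int i = 0; i < NR_IDS; i++) {
        if (a->extended_id[i] != 0 && a->extended_id[i] == b->extended_id[i])
            n++;
    }
    return n;
}
```

      In this model, a parent with one extended id still shares that id with
      the child after fork_buggy(), while fork_fixed() leaves no id in common.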
      
      Fixes: f384796c ("powerpc/mm: Add support for handling > 512TB address in SLB miss")
      Cc: stable@vger.kernel.org # v4.17+
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/resctrl: Don't stop walking closids when a locksetup group is found · d0dcce78
      James Morse authored
      commit 87d3aa28 upstream.
      
      When a new control group is created __init_one_rdt_domain() walks all
      the other closids to calculate the sets of used and unused bits.
      
      If it discovers a pseudo_locksetup group, it breaks out of the loop.  This
      means any later closid doesn't get its used bits added to used_b.  These
      bits will then get set in unused_b, and added to the new control group's
      configuration, even if they were marked as exclusive for a later closid.
      
      When encountering a pseudo_locksetup group, we should continue. This is
      because "a resource group enters 'pseudo-locked' mode after the schemata is
      written while the resource group is in 'pseudo-locksetup' mode." When we
      find a pseudo_locksetup group, its configuration is expected to be
      overwritten, so we can skip it.
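
      The difference between breaking out of the loop and continuing can be
      sketched as follows. This is a simplified model, not the kernel's
      __init_one_rdt_domain(): struct closid_cfg, its fields and
      collect_used_bits() are hypothetical names, and the skip_only switch
      exists only to compare the two behaviours side by side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one closid's state. */
struct closid_cfg {
    bool pseudo_locksetup;  /* group is in pseudo-locksetup mode */
    uint32_t ctrl;          /* bitmask currently claimed by the group */
};

/* Accumulates the bits used by all groups. With skip_only set, a
 * pseudo-locksetup group is skipped (continue); without it, the walk
 * stops at the first such group (break), as in the old behaviour. */
uint32_t collect_used_bits(const struct closid_cfg *cfg, int n, bool skip_only)
{
    uint32_t used_b = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (cfg[i].pseudo_locksetup) {
            if (skip_only)
                continue;   /* skip only this group */
            break;          /* later closids are never examined */
        }
        used_b |= cfg[i].ctrl;
    }
    return used_b;
}
```

      With a pseudo-locksetup group in the middle of the array, the break
      variant misses the bits of every later group, so they would wrongly be
      treated as unused.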
      
      Fixes: dfe9674b ("x86/intel_rdt: Enable entering of pseudo-locksetup mode")
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Reinette Chatre <reinette.chatre@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H Peter Avin <hpa@zytor.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20190603172531.178830-1-james.morse@arm.com
      [Dropped comment due to lack of space]
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: Do not use stack memory with scatterlist for GMAC · 7a663886
      Jouni Malinen authored
      commit a71fd9da upstream.
      
      ieee80211_aes_gmac() uses the mic argument directly in sg_set_buf() and
      that does not allow use of stack memory (e.g., BUG_ON() is hit in
      sg_set_buf() with CONFIG_DEBUG_SG). BIP GMAC TX side is fine for this
      since it can use the skb data buffer, but the RX side was using a stack
      variable for deriving the local MIC value to compare against the
      received one.
      
      Fix this by allocating heap memory for the mic buffer.
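
      The pattern of computing the local MIC into a heap buffer, comparing it
      and freeing it can be sketched in userspace as below. This is only an
      illustration: toy_mic() is a hypothetical stand-in with no cryptographic
      value, verify_mic() is a hypothetical name, the 16-byte MIC_LEN is an
      assumption of the sketch, and malloc()/free() stand in for the kernel
      allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MIC_LEN 16  /* MIC length assumed for this sketch */

/* Hypothetical stand-in for the real MIC computation: XOR-folds the
 * data into the output buffer. Not cryptographically meaningful. */
void toy_mic(const unsigned char *data, size_t len, unsigned char *mic)
{
    memset(mic, 0, MIC_LEN);
    for (size_t i = 0; i < len; i++)
        mic[i % MIC_LEN] ^= data[i];
}

/* Returns 1 if the received MIC matches, 0 if it does not,
 * and -1 if the buffer could not be allocated. */
int verify_mic(const unsigned char *data, size_t len,
               const unsigned char *rx_mic)
{
    unsigned char *mic = malloc(MIC_LEN);   /* heap buffer, not a stack array */
    int ok;

    if (!mic)
        return -1;
    toy_mic(data, len, mic);
    ok = memcmp(mic, rx_mic, MIC_LEN) == 0;
    free(mic);                              /* released on every path */
    return ok;
}
```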
      
      This was found with hwsim test case ap_cipher_bip_gmac_128 hitting that
      BUG_ON() and kernel panic.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Jouni Malinen <j@w1.fi>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • nl80211: fix station_info pertid memory leak · 34e22e35
      Andy Strohman authored
      commit f77bf486 upstream.
      
      When dumping stations, memory allocated for station_info's
      pertid member will leak if the nl80211 header cannot be added to
      the sk_buff due to insufficient tail room.
      
      I noticed this leak in the kmalloc-2048 cache.
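
      The shape of the leak, an early-return error path that skips the release
      of a dynamically allocated member, can be modelled in userspace as
      below. All names here (struct sinfo, fill_sinfo(), release_sinfo(),
      send_station(), live_allocs) are hypothetical and only illustrate the
      pattern; they are not the nl80211 functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for station_info and its per-TID statistics. */
struct sinfo {
    int *pertid;
};

int live_allocs;    /* buffers not yet freed; bookkeeping for the sketch */

/* Models gathering station info, which allocates the pertid member. */
bool fill_sinfo(struct sinfo *si)
{
    si->pertid = calloc(16, sizeof(*si->pertid));
    if (!si->pertid)
        return false;
    live_allocs++;
    return true;
}

/* Frees the pertid member if it is still allocated. */
void release_sinfo(struct sinfo *si)
{
    if (si->pertid) {
        free(si->pertid);
        si->pertid = NULL;
        live_allocs--;
    }
}

/* Models sending one station entry. header_fits says whether the message
 * header could be added. Returns 0 on success and -1 on failure. */
int send_station(struct sinfo *si, bool header_fits)
{
    if (!header_fits) {
        release_sinfo(si);  /* the error path must free pertid as well */
        return -1;
    }
    /* ... station attributes would be written here ... */
    release_sinfo(si);
    return 0;
}
```

      Dropping the release_sinfo() call from the error branch reproduces the
      leak: live_allocs stays at 1 after a failed send.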
      
      Cc: stable@vger.kernel.org
      Fixes: 8689c051 ("cfg80211: dynamically allocate per-tid stats for station info")
      Signed-off-by: Andy Strohman <andy@uplevelsystems.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: handle deauthentication/disassociation from TDLS peer · b8caf5aa
      Yu Wang authored
      commit 79c92ca4 upstream.
      
      When receiving a deauthentication/disassociation frame from a TDLS
      peer, a station should not disconnect the current AP, but only
      disable the current TDLS link if it's enabled.
      
      Without this change, a TDLS issue can be reproduced with the
      following steps:

      1. STA-1 and STA-2 are connected to the AP, and bidirectional traffic is
         running between STA-1 and STA-2.
      2. Set up a TDLS link between STA-1 and STA-2, stay for a while, then
         tear down the TDLS link.
      3. Repeat step #2 and monitor the connection between the STA and the AP.
      
      During the test, one STA may send a deauthentication/disassociation
      frame to another, after TDLS teardown, with reason code 6/7, which
      means: Class 2/3 frame received from nonassociated STA.
      
      On receiving this frame, the receiving STA disconnects from the current
      AP and then reconnects. This is not the expected behavior: the purpose of
      this frame should be to disable the TDLS link, not the link with the AP.
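
      The intended decision can be sketched as a small helper. This is an
      illustrative model rather than the mac80211 code: enum deauth_action and
      handle_deauth() are hypothetical names, and treating every frame that
      does not come from a TDLS peer as coming from the AP is a simplifying
      assumption.

```c
#include <assert.h>
#include <stdbool.h>

enum deauth_action {
    DISCONNECT_AP,      /* drop the association with the current AP */
    TEARDOWN_TDLS,      /* only disable the TDLS link to the sender */
    KEEP_ALL_LINKS,     /* nothing to tear down */
};

/* Decides how to react to a deauthentication/disassociation frame.
 * Frames not sent by a TDLS peer are assumed to come from the AP. */
enum deauth_action handle_deauth(bool from_tdls_peer, bool tdls_link_enabled)
{
    if (from_tdls_peer) {
        /* Never touch the AP link because of a TDLS peer's frame. */
        return tdls_link_enabled ? TEARDOWN_TDLS : KEEP_ALL_LINKS;
    }
    return DISCONNECT_AP;
}
```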
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Yu Wang <yyuwang@codeaurora.org>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • {nl,mac}80211: allow 4addr AP operation on crypto controlled devices · 0dd7d335
      Manikanta Pubbisetty authored
      commit 33d915d9 upstream.
      
      As per the current design, in the case of sw crypto controlled devices,
      it is the device which advertises the support for AP/VLAN iftype based
      on its ability to transmit packets encrypted in software
      (in VLAN functionality, group traffic generated for a specific
      VLAN group is always encrypted in software). Commit db3bdcb9
      ("mac80211: allow AP_VLAN operation on crypto controlled devices")
      introduced this change.
      
      Since 4addr AP operation also uses AP/VLAN iftype, this conditional
      way of advertising AP/VLAN support has broken 4addr AP mode operation on
      crypto controlled devices which do not support VLAN functionality.
      
      In the case of ath10k driver, not all firmwares have support for VLAN
      functionality but all can support 4addr AP operation. Because AP/VLAN
      support is not advertised for these devices, 4addr AP operations are
      also blocked.
      
      Fix this by allowing 4addr operation on devices which do not support
      AP/VLAN iftype but can support 4addr AP operation (decision is based on
      the wiphy flag WIPHY_FLAG_4ADDR_AP).
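
      The resulting permission check can be sketched as below. This is a
      hedged model of the decision only: may_use_4addr_ap() is a hypothetical
      helper, and SKETCH_FLAG_4ADDR_AP is a placeholder bit standing in for
      WIPHY_FLAG_4ADDR_AP, whose real value is not taken from the kernel
      headers here.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit for this sketch; stands in for WIPHY_FLAG_4ADDR_AP. */
#define SKETCH_FLAG_4ADDR_AP (1u << 0)

/* 4addr AP operation is allowed if the device advertises the AP/VLAN
 * iftype, or if it does not but still declares 4addr AP support. */
bool may_use_4addr_ap(bool supports_ap_vlan, unsigned int wiphy_flags)
{
    if (supports_ap_vlan)
        return true;
    return (wiphy_flags & SKETCH_FLAG_4ADDR_AP) != 0;
}
```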
      
      Cc: stable@vger.kernel.org
      Fixes: db3bdcb9 ("mac80211: allow AP_VLAN operation on crypto controlled devices")
      Signed-off-by: Manikanta Pubbisetty <mpubbise@codeaurora.org>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: drop robust management frames from unknown TA · 61113ed9
      Johannes Berg authored
      commit 588f7d39 upstream.
      
      When receiving a robust management frame, drop it if we don't have
      rx->sta since then we don't have a security association and thus
      couldn't possibly validate the frame.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cfg80211: fix memory leak of wiphy device name · 4a6d3e2f
      Eric Biggers authored
      commit 4f488fbc upstream.
      
      In wiphy_new_nm(), if an error occurs after dev_set_name() and
      device_initialize() have already been called, it's necessary to call
      put_device() (via wiphy_free()) to avoid a memory leak.
      
      Reported-by: syzbot+7fddca22578bc67c3fe4@syzkaller.appspotmail.com
      Fixes: 1f87f7d3 ("cfg80211: add rfkill support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fs/namespace: fix unprivileged mount propagation · 4bb1bedc
      Christian Brauner authored
      commit d728cf79 upstream.
      
      When propagating mounts across mount namespaces owned by different user
      namespaces it is not possible anymore to move or umount the mount in the
      less privileged mount namespace.
      
      Here is a reproducer:
      
        sudo mount -t tmpfs tmpfs /mnt
        sudo mount --make-rshared /mnt
      
        # create unprivileged user + mount namespace and preserve propagation
        unshare -U -m --map-root --propagation=unchanged
      
        # now change back to the original mount namespace in another terminal:
        sudo mkdir /mnt/aaa
        sudo mount -t tmpfs tmpfs /mnt/aaa
      
        # now in the unprivileged user + mount namespace
        mount --move /mnt/aaa /opt
      
      Unfortunately, this is a pretty big deal for userspace since this is
      e.g. used to inject mounts into running unprivileged containers.
      So this regression really needs to go away rather quickly.
      
      The problem is that a recent change falsely locked the root of the newly
      added mounts by setting MNT_LOCKED. Fix this by only locking the mounts
      on copy_mnt_ns() and not when adding a new mount.
      
      Fixes: 3bd045cc ("separate copying and locking mount tree on cross-userns copies")
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: <stable@vger.kernel.org>
      Tested-by: Christian Brauner <christian@brauner.io>
      Acked-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Christian Brauner <christian@brauner.io>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/vdso: Prevent segfaults due to hoisted vclock reads · cf37b1a0
      Andy Lutomirski authored
      commit ff17bbe0 upstream.
      
      GCC 5.5.0 sometimes cleverly hoists reads of the pvclock and/or hvclock
      pages before the vclock mode checks.  This creates a path through
      vclock_gettime() in which no vclock is enabled at all (due to disabled
      TSC on old CPUs, for example) but the pvclock or hvclock page is
      nevertheless read.  This will segfault on bare metal.
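
      The text here does not say how the hoisting is prevented. One common
      technique, shown below purely as an assumption, is to place a compiler
      barrier between the mode check and the load so that the compiler cannot
      move the read above the check. compiler_barrier(), enum clock_mode and
      read_clock() are hypothetical names, and the inline asm uses GCC/Clang
      syntax.

```c
#include <assert.h>
#include <stddef.h>

/* Compiler-only barrier (GCC/Clang syntax): emits no instruction, but the
 * compiler may not move memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

enum clock_mode {
    MODE_NONE,  /* no clock page is mapped; page must not be touched */
    MODE_PAGE,  /* page points at valid clock data */
};

/* Reads *page only after the mode check has confirmed the page exists.
 * Returns -1 when no clock page is available. */
long read_clock(enum clock_mode mode, const long *page)
{
    if (mode == MODE_PAGE) {
        compiler_barrier();     /* keep the load below the check */
        return *page;
    }
    return -1;
}
```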
      
      This fixes commit 459e3a21 ("gcc-9: properly declare the
      {pv,hv}clock_page storage") in the sense that, before that commit, GCC
      didn't seem to generate the offending code.  There was nothing wrong
      with that commit per se, and -stable maintainers should backport this to
      all supported kernels regardless of whether the offending commit was
      present, since the same crash could just as easily be triggered by the
      phase of the moon.
      
      On GCC 9.1.1, this doesn't seem to affect the generated code at all, so
      I'm not too concerned about performance regressions from this fix.
      
      Cc: stable@vger.kernel.org
      Cc: x86@kernel.org
      Cc: Borislav Petkov <bp@alien8.de>
      Reported-by: Duncan Roe <duncan_roe@optusnet.com.au>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>