1. 05 Feb, 2020 22 commits
    • Dan Carpenter's avatar
      Bluetooth: Fix race condition in hci_release_sock() · 71729b05
      Dan Carpenter authored
      commit 11eb85ec upstream.
      
      Syzbot managed to trigger a use after free "KASAN: use-after-free Write
      in hci_sock_bind".  I have reviewed the code manually and one possibly
      cause I have found is that we are not holding lock_sock(sk) when we do
      the hci_dev_put(hdev) in hci_sock_release().  My theory is that the bind
      and the release are racing against each other which results in this use
      after free.
      
      Reported-by: syzbot+eba992608adf3d796bcc@syzkaller.appspotmail.com
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      71729b05
    • Zhenzhong Duan's avatar
      ttyprintk: fix a potential deadlock in interrupt context issue · fb566870
      Zhenzhong Duan authored
      commit 9a655c77 upstream.
      
      tpk_write()/tpk_close() could be interrupted when holding a mutex, then
      in timer handler tpk_write() may be called again trying to acquire same
      mutex, lead to deadlock.
      
      Google syzbot reported this issue with CONFIG_DEBUG_ATOMIC_SLEEP
      enabled:
      
      BUG: sleeping function called from invalid context at
      kernel/locking/mutex.c:938
      in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/1
      1 lock held by swapper/1/0:
      ...
      Call Trace:
        <IRQ>
        dump_stack+0x197/0x210
        ___might_sleep.cold+0x1fb/0x23e
        __might_sleep+0x95/0x190
        __mutex_lock+0xc5/0x13c0
        mutex_lock_nested+0x16/0x20
        tpk_write+0x5d/0x340
        resync_tnc+0x1b6/0x320
        call_timer_fn+0x1ac/0x780
        run_timer_softirq+0x6c3/0x1790
        __do_softirq+0x262/0x98c
        irq_exit+0x19b/0x1e0
        smp_apic_timer_interrupt+0x1a3/0x610
        apic_timer_interrupt+0xf/0x20
        </IRQ>
      
      See link https://syzkaller.appspot.com/bug?extid=2eeef62ee31f9460ad65 for
      more details.
      
      Fix it by using spinlock in process context instead of mutex and having
      interrupt disabled in critical section.
      
      Reported-by: syzbot+2eeef62ee31f9460ad65@syzkaller.appspotmail.com
      Signed-off-by: default avatarZhenzhong Duan <zhenzhong.duan@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Link: https://lore.kernel.org/r/20200113034842.435-1-zhenzhong.duan@gmail.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb566870
    • Tetsuo Handa's avatar
      tomoyo: Use atomic_t for statistics counter · 8f1c7fe1
      Tetsuo Handa authored
      commit a8772fad upstream.
      
      syzbot is reporting that there is a race at tomoyo_stat_update() [1].
      Although it is acceptable to fail to track exact number of times policy
      was updated, convert to atomic_t because this is not a hot path.
      
      [1] https://syzkaller.appspot.com/bug?id=a4d7b973972eeed410596e6604580e0133b0fc04Reported-by: default avatarsyzbot <syzbot+efea72d4a0a1d03596cd@syzkaller.appspotmail.com>
      Signed-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8f1c7fe1
    • Hans Verkuil's avatar
      media: dvb-usb/dvb-usb-urb.c: initialize actlen to 0 · ddba92fa
      Hans Verkuil authored
      commit 569bc8d6 upstream.
      
      This fixes a syzbot failure since actlen could be uninitialized,
      but it was still used.
      
      Syzbot link:
      
      https://syzkaller.appspot.com/bug?extid=6bf9606ee955b646c0e1
      
      Reported-and-tested-by: syzbot+6bf9606ee955b646c0e1@syzkaller.appspotmail.com
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Acked-by: default avatarSean Young <sean@mess.org>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ddba92fa
    • Hans Verkuil's avatar
      media: gspca: zero usb_buf · 3f43d55a
      Hans Verkuil authored
      commit de89d086 upstream.
      
      Allocate gspca_dev->usb_buf with kzalloc instead of kmalloc to
      ensure it is property zeroed. This fixes various syzbot errors
      about uninitialized data.
      
      Syzbot links:
      
      https://syzkaller.appspot.com/bug?extid=32310fc2aea76898d074
      https://syzkaller.appspot.com/bug?extid=99706d6390be1ac542a2
      https://syzkaller.appspot.com/bug?extid=64437af5c781a7f0e08e
      
      Reported-and-tested-by: syzbot+32310fc2aea76898d074@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+99706d6390be1ac542a2@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+64437af5c781a7f0e08e@syzkaller.appspotmail.com
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f43d55a
    • Sean Young's avatar
      media: vp7045: do not read uninitialized values if usb transfer fails · 373403c6
      Sean Young authored
      commit 26cff637 upstream.
      
      It is not a fatal error if reading the mac address or the remote control
      decoder state fails.
      
      Reported-by: syzbot+ec869945d3dde5f33b43@syzkaller.appspotmail.com
      Signed-off-by: default avatarSean Young <sean@mess.org>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      373403c6
    • Sean Young's avatar
      media: af9005: uninitialized variable printked · bb3d4573
      Sean Young authored
      commit 51d0c99b upstream.
      
      If usb_bulk_msg() fails, actual_length can be uninitialized.
      
      Reported-by: syzbot+9d42b7773d2fecd983ab@syzkaller.appspotmail.com
      Signed-off-by: default avatarSean Young <sean@mess.org>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bb3d4573
    • Sean Young's avatar
      media: digitv: don't continue if remote control state can't be read · 12466938
      Sean Young authored
      commit eecc70d2 upstream.
      
      This results in an uninitialized variable read.
      
      Reported-by: syzbot+6bf9606ee955b646c0e1@syzkaller.appspotmail.com
      Signed-off-by: default avatarSean Young <sean@mess.org>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      12466938
    • Jan Kara's avatar
      reiserfs: Fix memory leak of journal device string · 1764dc15
      Jan Kara authored
      commit 5474ca7d upstream.
      
      When a filesystem is mounted with jdev mount option, we store the
      journal device name in an allocated string in superblock. However we
      fail to ever free that string. Fix it.
      
      Reported-by: syzbot+1c6756baf4b16b94d2a6@syzkaller.appspotmail.com
      Fixes: c3aa0776 ("reiserfs: Properly display mount options in /proc/mounts")
      CC: stable@vger.kernel.org
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1764dc15
    • Dan Carpenter's avatar
      mm/mempolicy.c: fix out of bounds write in mpol_parse_str() · 732ecd4a
      Dan Carpenter authored
      commit c7a91bc7 upstream.
      
      What we are trying to do is change the '=' character to a NUL terminator
      and then at the end of the function we restore it back to an '='.  The
      problem is there are two error paths where we jump to the end of the
      function before we have replaced the '=' with NUL.
      
      We end up putting the '=' in the wrong place (possibly one element
      before the start of the buffer).
      
      Link: http://lkml.kernel.org/r/20200115055426.vdjwvry44nfug7yy@kili.mountain
      Reported-by: syzbot+e64a13c5369a194d67df@syzkaller.appspotmail.com
      Fixes: 095f1fc4 ("mempolicy: rework shmem mpol parsing and display")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Dmitry Vyukov <dvyukov@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      732ecd4a
    • Theodore Ts'o's avatar
      ext4: validate the debug_want_extra_isize mount option at parse time · cb1702c4
      Theodore Ts'o authored
      commit 9803387c upstream.
      
      Instead of setting s_want_extra_size and then making sure that it is a
      valid value afterwards, validate the field before we set it.  This
      avoids races and other problems when remounting the file system.
      
      Link: https://lore.kernel.org/r/20191215063020.GA11512@mit.edu
      Cc: stable@kernel.org
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reported-and-tested-by: syzbot+4a39a025912b265cacef@syzkaller.appspotmail.com
      Signed-off-by: default avatarZubin Mithra <zsm@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cb1702c4
    • Dirk Behme's avatar
      arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean' · 1f3b1614
      Dirk Behme authored
      commit d7bbd6c1 upstream.
      
      Since v4.3-rc1 commit 0723c05f ("arm64: enable more compressed
      Image formats"), it is possible to build Image.{bz2,lz4,lzma,lzo}
      AArch64 images. However, the commit missed adding support for removing
      those images on 'make ARCH=arm64 (dist)clean'.
      
      Fix this by adding them to the target list.
      Make sure to match the order of the recipes in the makefile.
      
      Cc: stable@vger.kernel.org # v4.3+
      Fixes: 0723c05f ("arm64: enable more compressed Image formats")
      Signed-off-by: default avatarDirk Behme <dirk.behme@de.bosch.com>
      Signed-off-by: default avatarEugeniu Rosca <erosca@de.adit-jv.com>
      Reviewed-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f3b1614
    • Vitaly Chikunov's avatar
      tools lib: Fix builds when glibc contains strlcpy() · 6d6c4c1b
      Vitaly Chikunov authored
      commit 6c4798d3 upstream.
      
      Disable a couple of compilation warnings (which are treated as errors)
      on strlcpy() definition and declaration, allowing users to compile perf
      and kernel (objtool) when:
      
      1. glibc have strlcpy() (such as in ALT Linux since 2004) objtool and
         perf build fails with this (in gcc):
      
        In file included from exec-cmd.c:3:
        tools/include/linux/string.h:20:15: error: redundant redeclaration of ‘strlcpy’ [-Werror=redundant-decls]
           20 | extern size_t strlcpy(char *dest, const char *src, size_t size);
      
      2. clang ignores `-Wredundant-decls', but produces another warning when
         building perf:
      
          CC       util/string.o
        ../lib/string.c:99:8: error: attribute declaration must precede definition [-Werror,-Wignored-attributes]
        size_t __weak strlcpy(char *dest, const char *src, size_t size)
        ../../tools/include/linux/compiler.h:66:34: note: expanded from macro '__weak'
        # define __weak                 __attribute__((weak))
        /usr/include/bits/string_fortified.h:151:8: note: previous definition is here
        __NTH (strlcpy (char *__restrict __dest, const char *__restrict __src,
      
      Committer notes:
      
      The
      
       #pragma GCC diagnostic
      
      directive was introduced in gcc 4.6, so check for that as well.
      
      Fixes: ce990917 ("perf tools: Move strlcpy() from perf to tools/lib/string.c")
      Fixes: 0215d59b ("tools lib: Reinstate strlcpy() header guard with __UCLIBC__")
      Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=118481Signed-off-by: default avatarVitaly Chikunov <vt@altlinux.org>
      Reviewed-by: default avatarDmitry Levin <ldv@altlinux.org>
      Cc: Dmitry Levin <ldv@altlinux.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: kbuild test robot <lkp@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Cc: Vineet Gupta <vineet.gupta1@synopsys.com>
      Link: http://lore.kernel.org/lkml/20191224172029.19690-1-vt@altlinux.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d6c4c1b
    • Chanwoo Choi's avatar
      PM / devfreq: Add new name attribute for sysfs · 1635d4fc
      Chanwoo Choi authored
      commit 2fee1a7c upstream.
      
      The commit 4585fbcb ("PM / devfreq: Modify the device name as devfreq(X) for
      sysfs") changed the node name to devfreq(x). After this commit, it is not
      possible to get the device name through /sys/class/devfreq/devfreq(X)/*.
      
      Add new name attribute in order to get device name.
      
      Cc: stable@vger.kernel.org
      Fixes: 4585fbcb ("PM / devfreq: Modify the device name as devfreq(X) for sysfs")
      Signed-off-by: default avatarChanwoo Choi <cw00.choi@samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1635d4fc
    • Andres Freund's avatar
      perf c2c: Fix return type for histogram sorting comparision functions · e292b266
      Andres Freund authored
      commit c1c8013e upstream.
      
      Commit 722ddfde ("perf tools: Fix time sorting") changed - correctly
      so - hist_entry__sort to return int64. Unfortunately several of the
      builtin-c2c.c comparison routines only happened to work due the cast
      caused by the wrong return type.
      
      This causes meaningless ordering of both the cacheline list, and the
      cacheline details page. E.g a simple:
      
        perf c2c record -a sleep 3
        perf c2c report
      
      will result in cacheline table like
        =================================================
                   Shared Data Cache Line Table
        =================================================
        #
        #        ------- Cacheline ----------    Total     Tot  - LLC Load Hitm -  - Store Reference -  - Load Dram -     LLC  Total  - Core Load Hit -  - LLC Load Hit -
        # Index         Address  Node  PA cnt  records    Hitm  Total  Lcl    Rmt  Total  L1Hit  L1Miss     Lcl   Rmt  Ld Miss  Loads    FB    L1   L2     Llc      Rmt
        # .....  ..............  ....  ......  .......  ......  .....  .....  ...  ....   .....  ......  ......  ....  ......   .....  .....  ..... ...  ....     .......
      
              0  0x7f0d27ffba00   N/A       0       52   0.12%     13      6    7    12      12       0       0     7      14      40      4     16    0    0           0
              1  0x7f0d27ff61c0   N/A       0     6353  14.04%   1475    801  674   779     779       0       0   718    1392    5574   1299   1967    0  115           0
              2  0x7f0d26d3ec80   N/A       0       71   0.15%     16      4   12    13      13       0       0    12      24      58      1     20    0    9           0
              3  0x7f0d26d3ec00   N/A       0       98   0.22%     23     17    6    19      19       0       0     6      12      79      0     40    0   10           0
      
      i.e. with the list not being ordered by Total Hitm.
      
      Fixes: 722ddfde ("perf tools: Fix time sorting")
      Signed-off-by: default avatarAndres Freund <andres@anarazel.de>
      Tested-by: default avatarMichael Petlan <mpetlan@redhat.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org # v3.16+
      Link: http://lore.kernel.org/lkml/20200109043030.233746-1-andres@anarazel.deSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e292b266
    • Johan Hovold's avatar
      rsi: fix use-after-free on failed probe and unbind · 94b4f57a
      Johan Hovold authored
      [ Upstream commit e93cd351 ]
      
      Make sure to stop both URBs before returning after failed probe as well
      as on disconnect to avoid use-after-free in the completion handler.
      
      Reported-by: syzbot+b563b7f8dbe8223a51e8@syzkaller.appspotmail.com
      Fixes: a4302bff ("rsi: add bluetooth rx endpoint")
      Fixes: dad0d04f ("rsi: Add RS9113 wireless driver")
      Cc: stable <stable@vger.kernel.org>     # 3.15
      Cc: Siva Rebbagondla <siva.rebbagondla@redpinesignals.com>
      Cc: Prameela Rani Garnepudi <prameela.j04cs@gmail.com>
      Cc: Amitkumar Karwar <amit.karwar@redpinesignals.com>
      Cc: Fariya Fatima <fariyaf@gmail.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      94b4f57a
    • Siva Rebbagondla's avatar
      rsi: add hci detach for hibernation and poweroff · 28fc6259
      Siva Rebbagondla authored
      [ Upstream commit cbde979b ]
      
      As we missed to detach HCI, while entering power off or hibernation,
      an extra hci interface gets created whenever system is woken up, to
      avoid this we added hci_detach() in rsi_disconnect(), rsi_freeze(),
      and rsi_shutdown() functions which are invoked for these tests.
      This patch fixes the issue
      Signed-off-by: default avatarSiva Rebbagondla <siva.rebbagondla@redpinesignals.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      28fc6259
    • Herbert Xu's avatar
      crypto: pcrypt - Fix user-after-free on module unload · 47ef5cb8
      Herbert Xu authored
      [ Upstream commit 07bfd9bd ]
      
      On module unload of pcrypt we must unregister the crypto algorithms
      first and then tear down the padata structure.  As otherwise the
      crypto algorithms are still alive and can be used while the padata
      structure is being freed.
      
      Fixes: 5068c7a8 ("crypto: pcrypt - Add pcrypt crypto...")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      47ef5cb8
    • Xiaochen Shen's avatar
      x86/resctrl: Fix a deadlock due to inaccurate reference · cc071b7c
      Xiaochen Shen authored
      commit 334b0f4e upstream.
      
      There is a race condition which results in a deadlock when rmdir and
      mkdir execute concurrently:
      
      $ ls /sys/fs/resctrl/c1/mon_groups/m1/
      cpus  cpus_list  mon_data  tasks
      
      Thread 1: rmdir /sys/fs/resctrl/c1
      Thread 2: mkdir /sys/fs/resctrl/c1/mon_groups/m1
      
      3 locks held by mkdir/48649:
       #0:  (sb_writers#17){.+.+}, at: [<ffffffffb4ca2aa0>] mnt_want_write+0x20/0x50
       #1:  (&type->i_mutex_dir_key#8/1){+.+.}, at: [<ffffffffb4c8c13b>] filename_create+0x7b/0x170
       #2:  (rdtgroup_mutex){+.+.}, at: [<ffffffffb4a4389d>] rdtgroup_kn_lock_live+0x3d/0x70
      
      4 locks held by rmdir/48652:
       #0:  (sb_writers#17){.+.+}, at: [<ffffffffb4ca2aa0>] mnt_want_write+0x20/0x50
       #1:  (&type->i_mutex_dir_key#8/1){+.+.}, at: [<ffffffffb4c8c3cf>] do_rmdir+0x13f/0x1e0
       #2:  (&type->i_mutex_dir_key#8){++++}, at: [<ffffffffb4c86d5d>] vfs_rmdir+0x4d/0x120
       #3:  (rdtgroup_mutex){+.+.}, at: [<ffffffffb4a4389d>] rdtgroup_kn_lock_live+0x3d/0x70
      
      Thread 1 is deleting control group "c1". Holding rdtgroup_mutex,
      kernfs_remove() removes all kernfs nodes under directory "c1"
      recursively, then waits for sub kernfs node "mon_groups" to drop active
      reference.
      
      Thread 2 is trying to create a subdirectory "m1" in the "mon_groups"
      directory. The wrapper kernfs_iop_mkdir() takes an active reference to
      the "mon_groups" directory but the code drops the active reference to
      the parent directory "c1" instead.
      
      As a result, Thread 1 is blocked on waiting for active reference to drop
      and never release rdtgroup_mutex, while Thread 2 is also blocked on
      trying to get rdtgroup_mutex.
      
      Thread 1 (rdtgroup_rmdir)   Thread 2 (rdtgroup_mkdir)
      (rmdir /sys/fs/resctrl/c1)  (mkdir /sys/fs/resctrl/c1/mon_groups/m1)
      -------------------------   -------------------------
                                  kernfs_iop_mkdir
                                    /*
                                     * kn: "m1", parent_kn: "mon_groups",
                                     * prgrp_kn: parent_kn->parent: "c1",
                                     *
                                     * "mon_groups", parent_kn->active++: 1
                                     */
                                    kernfs_get_active(parent_kn)
      kernfs_iop_rmdir
        /* "c1", kn->active++ */
        kernfs_get_active(kn)
      
        rdtgroup_kn_lock_live
          atomic_inc(&rdtgrp->waitcount)
          /* "c1", kn->active-- */
          kernfs_break_active_protection(kn)
          mutex_lock
      
        rdtgroup_rmdir_ctrl
          free_all_child_rdtgrp
            sentry->flags = RDT_DELETED
      
          rdtgroup_ctrl_remove
            rdtgrp->flags = RDT_DELETED
            kernfs_get(kn)
            kernfs_remove(rdtgrp->kn)
              __kernfs_remove
                /* "mon_groups", sub_kn */
                atomic_add(KN_DEACTIVATED_BIAS, &sub_kn->active)
                kernfs_drain(sub_kn)
                  /*
                   * sub_kn->active == KN_DEACTIVATED_BIAS + 1,
                   * waiting on sub_kn->active to drop, but it
                   * never drops in Thread 2 which is blocked
                   * on getting rdtgroup_mutex.
                   */
      Thread 1 hangs here ---->
                  wait_event(sub_kn->active == KN_DEACTIVATED_BIAS)
                  ...
                                    rdtgroup_mkdir
                                      rdtgroup_mkdir_mon(parent_kn, prgrp_kn)
                                        mkdir_rdt_prepare(parent_kn, prgrp_kn)
                                          rdtgroup_kn_lock_live(prgrp_kn)
                                            atomic_inc(&rdtgrp->waitcount)
                                            /*
                                             * "c1", prgrp_kn->active--
                                             *
                                             * The active reference on "c1" is
                                             * dropped, but not matching the
                                             * actual active reference taken
                                             * on "mon_groups", thus causing
                                             * Thread 1 to wait forever while
                                             * holding rdtgroup_mutex.
                                             */
                                            kernfs_break_active_protection(
                                                                     prgrp_kn)
                                            /*
                                             * Trying to get rdtgroup_mutex
                                             * which is held by Thread 1.
                                             */
      Thread 2 hangs here ---->             mutex_lock
                                            ...
      
      The problem is that the creation of a subdirectory in the "mon_groups"
      directory incorrectly releases the active protection of its parent
      directory instead of itself before it starts waiting for rdtgroup_mutex.
      This is triggered by the rdtgroup_mkdir() flow calling
      rdtgroup_kn_lock_live()/rdtgroup_kn_unlock() with kernfs node of the
      parent control group ("c1") as argument. It should be called with kernfs
      node "mon_groups" instead. What is currently missing is that the
      kn->priv of "mon_groups" is NULL instead of pointing to the rdtgrp.
      
      Fix it by pointing kn->priv to rdtgrp when "mon_groups" is created. Then
      it could be passed to rdtgroup_kn_lock_live()/rdtgroup_kn_unlock()
      instead. And then it operates on the same rdtgroup structure but handles
      the active reference of kernfs node "mon_groups" to prevent deadlock.
      The same changes are also made to the "mon_data" directories.
      
      This results in some unused function parameters that will be cleaned up
      in follow-up patch as the focus here is on the fix only in support of
      backporting efforts.
      
      Backporting notes:
      
      Since upstream commit fa7d9493 ("x86/resctrl: Rename and move rdt
      files to a separate directory"), the file
      arch/x86/kernel/cpu/intel_rdt_rdtgroup.c has been renamed and moved to
      arch/x86/kernel/cpu/resctrl/rdtgroup.c.
      Apply the change against file arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
      for older stable trees.
      
      Fixes: c7d9aac6 ("x86/intel_rdt/cqm: Add mkdir support for RDT monitoring")
      Suggested-by: default avatarReinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: default avatarXiaochen Shen <xiaochen.shen@intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarReinette Chatre <reinette.chatre@intel.com>
      Reviewed-by: default avatarTony Luck <tony.luck@intel.com>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/1578500886-21771-4-git-send-email-xiaochen.shen@intel.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      cc071b7c
    • Xiaochen Shen's avatar
      x86/resctrl: Fix use-after-free due to inaccurate refcount of rdtgroup · 95a41c7b
      Xiaochen Shen authored
      commit 074fadee upstream.
      
      There is a race condition in the following scenario which results in an
      use-after-free issue when reading a monitoring file and deleting the
      parent ctrl_mon group concurrently:
      
      Thread 1 calls atomic_inc() to take refcount of rdtgrp and then calls
      kernfs_break_active_protection() to drop the active reference of kernfs
      node in rdtgroup_kn_lock_live().
      
      In Thread 2, kernfs_remove() is a blocking routine. It waits on all sub
      kernfs nodes to drop the active reference when removing all subtree
      kernfs nodes recursively. Thread 2 could block on kernfs_remove() until
      Thread 1 calls kernfs_break_active_protection(). Only after
      kernfs_remove() completes the refcount of rdtgrp could be trusted.
      
      Before Thread 1 calls atomic_inc() and kernfs_break_active_protection(),
      Thread 2 could call kfree() when the refcount of rdtgrp (sentry) is 0
      instead of 1 due to the race.
      
      In Thread 1, in rdtgroup_kn_unlock(), referring to earlier rdtgrp memory
      (rdtgrp->waitcount) which was already freed in Thread 2 results in
      use-after-free issue.
      
      Thread 1 (rdtgroup_mondata_show)  Thread 2 (rdtgroup_rmdir)
      --------------------------------  -------------------------
      rdtgroup_kn_lock_live
        /*
         * kn active protection until
         * kernfs_break_active_protection(kn)
         */
        rdtgrp = kernfs_to_rdtgroup(kn)
                                        rdtgroup_kn_lock_live
                                          atomic_inc(&rdtgrp->waitcount)
                                          mutex_lock
                                        rdtgroup_rmdir_ctrl
                                          free_all_child_rdtgrp
                                            /*
                                             * sentry->waitcount should be 1
                                             * but is 0 now due to the race.
                                             */
                                            kfree(sentry)*[1]
        /*
         * Only after kernfs_remove()
         * completes, the refcount of
         * rdtgrp could be trusted.
         */
        atomic_inc(&rdtgrp->waitcount)
        /* kn->active-- */
        kernfs_break_active_protection(kn)
                                          rdtgroup_ctrl_remove
                                            rdtgrp->flags = RDT_DELETED
                                            /*
                                             * Blocking routine, wait for
                                             * all sub kernfs nodes to drop
                                             * active reference in
                                             * kernfs_break_active_protection.
                                             */
                                            kernfs_remove(rdtgrp->kn)
                                        rdtgroup_kn_unlock
                                          mutex_unlock
                                          atomic_dec_and_test(
                                                      &rdtgrp->waitcount)
                                          && (flags & RDT_DELETED)
                                            kernfs_unbreak_active_protection(kn)
                                            kfree(rdtgrp)
        mutex_lock
      mon_event_read
      rdtgroup_kn_unlock
        mutex_unlock
        /*
         * Use-after-free: refer to earlier rdtgrp
         * memory which was freed in [1].
         */
        atomic_dec_and_test(&rdtgrp->waitcount)
        && (flags & RDT_DELETED)
          /* kn->active++ */
          kernfs_unbreak_active_protection(kn)
          kfree(rdtgrp)
      
      Fix it by moving free_all_child_rdtgrp() to after kernfs_remove() in
      rdtgroup_rmdir_ctrl() to ensure it has the accurate refcount of rdtgrp.
      
      Backporting notes:
      
      Since upstream commit fa7d9493 ("x86/resctrl: Rename and move rdt
      files to a separate directory"), the file
      arch/x86/kernel/cpu/intel_rdt_rdtgroup.c has been renamed and moved to
      arch/x86/kernel/cpu/resctrl/rdtgroup.c.
      Apply the change against file arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
      for older stable trees.
      
      Fixes: f3cbeaca ("x86/intel_rdt/cqm: Add rmdir support")
      Suggested-by: default avatarReinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: default avatarXiaochen Shen <xiaochen.shen@intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarReinette Chatre <reinette.chatre@intel.com>
      Reviewed-by: default avatarTony Luck <tony.luck@intel.com>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/1578500886-21771-3-git-send-email-xiaochen.shen@intel.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      95a41c7b
    • Xiaochen Shen's avatar
      x86/resctrl: Fix use-after-free when deleting resource groups · 1b006f8c
      Xiaochen Shen authored
      commit b8511ccc upstream.
      
      A resource group (rdtgrp) contains a reference count (rdtgrp->waitcount)
      that indicates how many waiters expect this rdtgrp to exist. Waiters
      could be waiting on rdtgroup_mutex or some work sitting on a task's
      workqueue for when the task returns from kernel mode or exits.
      
      The deletion of a rdtgrp is intended to have two phases:
      
        (1) while holding rdtgroup_mutex the necessary cleanup is done and
        rdtgrp->flags is set to RDT_DELETED,
      
        (2) after releasing the rdtgroup_mutex, the rdtgrp structure is freed
        only if there are no waiters and its flag is set to RDT_DELETED. Upon
        gaining access to rdtgroup_mutex or rdtgrp, a waiter is required to check
        for the RDT_DELETED flag.
      
      When unmounting the resctrl file system or deleting ctrl_mon groups,
      all of the subdirectories are removed and the data structure of rdtgrp
      is forcibly freed without checking rdtgrp->waitcount. If at this point
      there was a waiter on rdtgrp then a use-after-free issue occurs when the
      waiter starts running and accesses the rdtgrp structure it was waiting
      on.
      
      See kfree() calls in [1], [2] and [3] in these two call paths in
      following scenarios:
      (1) rdt_kill_sb() -> rmdir_all_sub() -> free_all_child_rdtgrp()
      (2) rdtgroup_rmdir() -> rdtgroup_rmdir_ctrl() -> free_all_child_rdtgrp()
      
      There are several scenarios that result in use-after-free issue in
      following:
      
      Scenario 1:
      -----------
      In Thread 1, rdtgroup_tasks_write() adds a task_work callback
      move_myself(). If move_myself() is scheduled to execute after Thread 2
      rdt_kill_sb() is finished, referring to earlier rdtgrp memory
      (rdtgrp->waitcount) which was already freed in Thread 2 results in
      use-after-free issue.
      
      Thread 1 (rdtgroup_tasks_write)        Thread 2 (rdt_kill_sb)
      -------------------------------        ----------------------
      rdtgroup_kn_lock_live
        atomic_inc(&rdtgrp->waitcount)
        mutex_lock
      rdtgroup_move_task
        __rdtgroup_move_task
          /*
           * Take an extra refcount, so rdtgrp cannot be freed
           * before the call back move_myself has been invoked
           */
          atomic_inc(&rdtgrp->waitcount)
          /* Callback move_myself will be scheduled for later */
          task_work_add(move_myself)
      rdtgroup_kn_unlock
        mutex_unlock
        atomic_dec_and_test(&rdtgrp->waitcount)
        && (flags & RDT_DELETED)
                                             mutex_lock
                                             rmdir_all_sub
                                               /*
                                                * sentry and rdtgrp are freed
                                                * without checking refcount
                                                */
                                               free_all_child_rdtgrp
                                                 kfree(sentry)*[1]
                                               kfree(rdtgrp)*[2]
                                             mutex_unlock
      /*
       * Callback is scheduled to execute
       * after rdt_kill_sb is finished
       */
      move_myself
        /*
         * Use-after-free: refer to earlier rdtgrp
         * memory which was freed in [1] or [2].
         */
        atomic_dec_and_test(&rdtgrp->waitcount)
        && (flags & RDT_DELETED)
          kfree(rdtgrp)
      
      Scenario 2:
      -----------
      In Thread 1, rdtgroup_tasks_write() adds a task_work callback
      move_myself(). If move_myself() is scheduled to execute after Thread 2
      rdtgroup_rmdir() is finished, referring to earlier rdtgrp memory
      (rdtgrp->waitcount) which was already freed in Thread 2 results in
      use-after-free issue.
      
      Thread 1 (rdtgroup_tasks_write)        Thread 2 (rdtgroup_rmdir)
      -------------------------------        -------------------------
      rdtgroup_kn_lock_live
        atomic_inc(&rdtgrp->waitcount)
        mutex_lock
      rdtgroup_move_task
        __rdtgroup_move_task
          /*
           * Take an extra refcount, so rdtgrp cannot be freed
           * before the call back move_myself has been invoked
           */
          atomic_inc(&rdtgrp->waitcount)
          /* Callback move_myself will be scheduled for later */
          task_work_add(move_myself)
      rdtgroup_kn_unlock
        mutex_unlock
        atomic_dec_and_test(&rdtgrp->waitcount)
        && (flags & RDT_DELETED)
                                             rdtgroup_kn_lock_live
                                               atomic_inc(&rdtgrp->waitcount)
                                               mutex_lock
                                             rdtgroup_rmdir_ctrl
                                               free_all_child_rdtgrp
                                                 /*
                                                  * sentry is freed without
                                                  * checking refcount
                                                  */
                                                 kfree(sentry)*[3]
                                               rdtgroup_ctrl_remove
                                                 rdtgrp->flags = RDT_DELETED
                                             rdtgroup_kn_unlock
                                               mutex_unlock
                                               atomic_dec_and_test(
                                                           &rdtgrp->waitcount)
                                               && (flags & RDT_DELETED)
                                                 kfree(rdtgrp)
      /*
       * Callback is scheduled to execute
       * after rdt_kill_sb is finished
       */
      move_myself
        /*
         * Use-after-free: refer to earlier rdtgrp
         * memory which was freed in [3].
         */
        atomic_dec_and_test(&rdtgrp->waitcount)
        && (flags & RDT_DELETED)
          kfree(rdtgrp)
      
      If CONFIG_DEBUG_SLAB=y, Slab corruption on kmalloc-2k can be observed
      like following. Note that "0x6b" is POISON_FREE after kfree(). The
      corrupted bits "0x6a", "0x64" at offset 0x424 correspond to
      waitcount member of struct rdtgroup which was freed:
      
        Slab corruption (Not tainted): kmalloc-2k start=ffff9504c5b0d000, len=2048
        420: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkjkkkkkkkkkkk
        Single bit error detected. Probably bad RAM.
        Run memtest86+ or a similar memory test tool.
        Next obj: start=ffff9504c5b0d800, len=2048
        000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
        010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      
        Slab corruption (Not tainted): kmalloc-2k start=ffff9504c58ab800, len=2048
        420: 6b 6b 6b 6b 64 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkdkkkkkkkkkkk
        Prev obj: start=ffff9504c58ab000, len=2048
        000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
        010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      
      Fix this by taking reference count (waitcount) of rdtgrp into account in
      the two call paths that currently do not do so. Instead of always
      freeing the resource group it will only be freed if there are no waiters
      on it. If there are waiters, the resource group will have its flags set
      to RDT_DELETED.
      
      It will be left to the waiter to free the resource group when it starts
      running and finding that it was the last waiter and the resource group
      has been removed (rdtgrp->flags & RDT_DELETED) since. (1) rdt_kill_sb()
      -> rmdir_all_sub() -> free_all_child_rdtgrp() (2) rdtgroup_rmdir() ->
      rdtgroup_rmdir_ctrl() -> free_all_child_rdtgrp()
      
      Backporting notes:
      
      Since upstream commit fa7d9493 ("x86/resctrl: Rename and move rdt
      files to a separate directory"), the file
      arch/x86/kernel/cpu/intel_rdt_rdtgroup.c has been renamed and moved to
      arch/x86/kernel/cpu/resctrl/rdtgroup.c.
      
      Apply the change against file arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
      in older stable trees.
      
      Fixes: f3cbeaca ("x86/intel_rdt/cqm: Add rmdir support")
      Fixes: 60cf5e10 ("x86/intel_rdt: Add mkdir to resctrl file system")
      Suggested-by: default avatarReinette Chatre <reinette.chatre@intel.com>
      Signed-off-by: default avatarXiaochen Shen <xiaochen.shen@intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarReinette Chatre <reinette.chatre@intel.com>
      Reviewed-by: default avatarTony Luck <tony.luck@intel.com>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/1578500886-21771-2-git-send-email-xiaochen.shen@intel.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      1b006f8c
    • Al Viro's avatar
      vfs: fix do_last() regression · 8d7a5100
      Al Viro authored
      commit 6404674a upstream.
      
      Brown paperbag time: fetching ->i_uid/->i_mode really should've been
      done from nd->inode.  I even suggested that, but the reason for that has
      slipped through the cracks and I went for dir->d_inode instead - made
      for more "obvious" patch.
      
      Analysis:
      
       - at the entry into do_last() and all the way to step_into(): dir (aka
         nd->path.dentry) is known not to have been freed; so's nd->inode and
         it's equal to dir->d_inode unless we are already doomed to -ECHILD.
         inode of the file to get opened is not known.
      
       - after step_into(): inode of the file to get opened is known; dir
         might be pointing to freed memory/be negative/etc.
      
       - at the call of may_create_in_sticky(): guaranteed to be out of RCU
         mode; inode of the file to get opened is known and pinned; dir might
         be garbage.
      
      The last was the reason for the original patch.  Except that at the
      do_last() entry we can be in RCU mode and it is possible that
      nd->path.dentry->d_inode has already changed under us.
      
      In that case we are going to fail with -ECHILD, but we need to be
      careful; nd->inode is pointing to valid struct inode and it's the same
      as nd->path.dentry->d_inode in "won't fail with -ECHILD" case, so we
      should use that.
      Reported-by: default avatar"Rantala, Tommi T. (Nokia - FI/Espoo)" <tommi.t.rantala@nokia.com>
      Reported-by: syzbot+190005201ced78a74ad6@syzkaller.appspotmail.com
      Wearing-brown-paperbag: Al Viro <viro@zeniv.linux.org.uk>
      Cc: stable@kernel.org
      Fixes: d0cb5018 ("do_last(): fetch directory ->i_mode and ->i_uid before it's too late")
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d7a5100
  2. 01 Feb, 2020 18 commits