1. 25 Aug, 2019 40 commits
    • YueHaibing's avatar
      ocfs2: remove set but not used variable 'last_hash' · 3bed38de
      YueHaibing authored
      [ Upstream commit 7bc36e3c ]
      
      Fixes gcc '-Wunused-but-set-variable' warning:
      
        fs/ocfs2/xattr.c: In function ocfs2_xattr_bucket_find:
        fs/ocfs2/xattr.c:3828:6: warning: variable last_hash set but not used [-Wunused-but-set-variable]
      
      It's never used and can be removed.
      
      Link: http://lkml.kernel.org/r/20190716132110.34836-1-yuehaibing@huawei.comSigned-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Acked-by: default avatarJoseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3bed38de
    • Jack Morgenstein's avatar
      IB/mad: Fix use-after-free in ib mad completion handling · 2c819823
      Jack Morgenstein authored
      [ Upstream commit 770b7d96 ]
      
      We encountered a use-after-free bug when unloading the driver:
      
      [ 3562.116059] BUG: KASAN: use-after-free in ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
      [ 3562.117233] Read of size 4 at addr ffff8882ca5aa868 by task kworker/u13:2/23862
      [ 3562.118385]
      [ 3562.119519] CPU: 2 PID: 23862 Comm: kworker/u13:2 Tainted: G           OE     5.1.0-for-upstream-dbg-2019-05-19_16-44-30-13 #1
      [ 3562.121806] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
      [ 3562.123075] Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
      [ 3562.124383] Call Trace:
      [ 3562.125640]  dump_stack+0x9a/0xeb
      [ 3562.126911]  print_address_description+0xe3/0x2e0
      [ 3562.128223]  ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
      [ 3562.129545]  __kasan_report+0x15c/0x1df
      [ 3562.130866]  ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
      [ 3562.132174]  kasan_report+0xe/0x20
      [ 3562.133514]  ib_mad_post_receive_mads+0xddc/0xed0 [ib_core]
      [ 3562.134835]  ? find_mad_agent+0xa00/0xa00 [ib_core]
      [ 3562.136158]  ? qlist_free_all+0x51/0xb0
      [ 3562.137498]  ? mlx4_ib_sqp_comp_worker+0x1970/0x1970 [mlx4_ib]
      [ 3562.138833]  ? quarantine_reduce+0x1fa/0x270
      [ 3562.140171]  ? kasan_unpoison_shadow+0x30/0x40
      [ 3562.141522]  ib_mad_recv_done+0xdf6/0x3000 [ib_core]
      [ 3562.142880]  ? _raw_spin_unlock_irqrestore+0x46/0x70
      [ 3562.144277]  ? ib_mad_send_done+0x1810/0x1810 [ib_core]
      [ 3562.145649]  ? mlx4_ib_destroy_cq+0x2a0/0x2a0 [mlx4_ib]
      [ 3562.147008]  ? _raw_spin_unlock_irqrestore+0x46/0x70
      [ 3562.148380]  ? debug_object_deactivate+0x2b9/0x4a0
      [ 3562.149814]  __ib_process_cq+0xe2/0x1d0 [ib_core]
      [ 3562.151195]  ib_cq_poll_work+0x45/0xf0 [ib_core]
      [ 3562.152577]  process_one_work+0x90c/0x1860
      [ 3562.153959]  ? pwq_dec_nr_in_flight+0x320/0x320
      [ 3562.155320]  worker_thread+0x87/0xbb0
      [ 3562.156687]  ? __kthread_parkme+0xb6/0x180
      [ 3562.158058]  ? process_one_work+0x1860/0x1860
      [ 3562.159429]  kthread+0x320/0x3e0
      [ 3562.161391]  ? kthread_park+0x120/0x120
      [ 3562.162744]  ret_from_fork+0x24/0x30
      ...
      [ 3562.187615] Freed by task 31682:
      [ 3562.188602]  save_stack+0x19/0x80
      [ 3562.189586]  __kasan_slab_free+0x11d/0x160
      [ 3562.190571]  kfree+0xf5/0x2f0
      [ 3562.191552]  ib_mad_port_close+0x200/0x380 [ib_core]
      [ 3562.192538]  ib_mad_remove_device+0xf0/0x230 [ib_core]
      [ 3562.193538]  remove_client_context+0xa6/0xe0 [ib_core]
      [ 3562.194514]  disable_device+0x14e/0x260 [ib_core]
      [ 3562.195488]  __ib_unregister_device+0x79/0x150 [ib_core]
      [ 3562.196462]  ib_unregister_device+0x21/0x30 [ib_core]
      [ 3562.197439]  mlx4_ib_remove+0x162/0x690 [mlx4_ib]
      [ 3562.198408]  mlx4_remove_device+0x204/0x2c0 [mlx4_core]
      [ 3562.199381]  mlx4_unregister_interface+0x49/0x1d0 [mlx4_core]
      [ 3562.200356]  mlx4_ib_cleanup+0xc/0x1d [mlx4_ib]
      [ 3562.201329]  __x64_sys_delete_module+0x2d2/0x400
      [ 3562.202288]  do_syscall_64+0x95/0x470
      [ 3562.203277]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The problem was that the MAD PD was deallocated before the MAD CQ.
      There was completion work pending for the CQ when the PD got deallocated.
      When the mad completion handling reached procedure
      ib_mad_post_receive_mads(), we got a use-after-free bug in the following
      line of code in that procedure:
         sg_list.lkey = qp_info->port_priv->pd->local_dma_lkey;
      (the pd pointer in the above line is no longer valid, because the
      pd has been deallocated).
      
      We fix this by allocating the PD before the CQ in procedure
      ib_mad_port_open(), and deallocating the PD after freeing the CQ
      in procedure ib_mad_port_close().
      
      Since the CQ completion work queue is flushed during ib_free_cq(),
      no completions will be pending for that CQ when the PD is later
      deallocated.
      
      Note that freeing the CQ before deallocating the PD is the practice
      in the ULPs.
      
      Fixes: 4be90bc6 ("IB/mad: Remove ib_get_dma_mr calls")
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
      Link: https://lore.kernel.org/r/20190801121449.24973-1-leon@kernel.orgSigned-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2c819823
    • Luck, Tony's avatar
      IB/core: Add mitigation for Spectre V1 · 65585fab
      Luck, Tony authored
      [ Upstream commit 61f25982 ]
      
      Some processors may mispredict an array bounds check and
      speculatively access memory that they should not. With
      a user supplied array index we like to play things safe
      by masking the value with the array size before it is
      used as an index.
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Link: https://lore.kernel.org/r/20190731043957.GA1600@agluck-desk2.amr.corp.intel.comSigned-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      65585fab
    • Qian Cai's avatar
      arm64/mm: fix variable 'pud' set but not used · 07a6a928
      Qian Cai authored
      [ Upstream commit 7d4e2dcf ]
      
      GCC throws a warning,
      
      arch/arm64/mm/mmu.c: In function 'pud_free_pmd_page':
      arch/arm64/mm/mmu.c:1033:8: warning: variable 'pud' set but not used
      [-Wunused-but-set-variable]
        pud_t pud;
              ^~~
      
      because pud_table() is a macro and compiled away. Fix it by making it a
      static inline function and for pud_sect() as well.
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      07a6a928
    • Qian Cai's avatar
      arm64/efi: fix variable 'si' set but not used · 7796efd6
      Qian Cai authored
      [ Upstream commit f1d48362 ]
      
      GCC throws out this warning on arm64.
      
      drivers/firmware/efi/libstub/arm-stub.c: In function 'efi_entry':
      drivers/firmware/efi/libstub/arm-stub.c:132:22: warning: variable 'si'
      set but not used [-Wunused-but-set-variable]
      
      Fix it by making free_screen_info() a static inline function.
      Acked-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7796efd6
    • Masahiro Yamada's avatar
      kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules · 1c335cd1
      Masahiro Yamada authored
      [ Upstream commit cb481993 ]
      
      KBUILD_EXTRA_SYMBOLS makes sense only when building external modules.
      Moreover, the modpost sets 'external_module' if the -e option is given.
      
      I replaced $(patsubst %, -e %,...) with simpler $(addprefix -e,...)
      while I was here.
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1c335cd1
    • Miquel Raynal's avatar
      ata: libahci: do not complain in case of deferred probe · 4f62e065
      Miquel Raynal authored
      [ Upstream commit 090bb803 ]
      
      Retrieving PHYs can defer the probe, do not spawn an error when
      -EPROBE_DEFER is returned, it is normal behavior.
      
      Fixes: b1a9edbd ("ata: libahci: allow to use multiple PHYs")
      Reviewed-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarMiquel Raynal <miquel.raynal@bootlin.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4f62e065
    • Don Brace's avatar
    • Kees Cook's avatar
      libata: zpodd: Fix small read overflow in zpodd_get_mech_type() · 0623446f
      Kees Cook authored
      [ Upstream commit 71d6c505 ]
      
      Jeffrin reported a KASAN issue:
      
        BUG: KASAN: global-out-of-bounds in ata_exec_internal_sg+0x50f/0xc70
        Read of size 16 at addr ffffffff91f41f80 by task scsi_eh_1/149
        ...
        The buggy address belongs to the variable:
          cdb.48319+0x0/0x40
      
      Much like commit 18c9a99b ("libata: zpodd: small read overflow in
      eject_tray()"), this fixes a cdb[] buffer length, this time in
      zpodd_get_mech_type():
      
      We read from the cdb[] buffer in ata_exec_internal_sg(). It has to be
      ATAPI_CDB_LEN (16) bytes long, but this buffer is only 12 bytes.
      Reported-by: default avatarJeffrin Jose T <jeffrin@rajagiritech.edu.in>
      Fixes: afe75951 ("libata: identify and init ZPODD devices")
      Link: https://lore.kernel.org/lkml/201907181423.E808958@keescook/Tested-by: default avatarJeffrin Jose T <jeffrin@rajagiritech.edu.in>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0623446f
    • Numfor Mbiziwo-Tiapo's avatar
      perf header: Fix use of unitialized value warning · 219db72f
      Numfor Mbiziwo-Tiapo authored
      [ Upstream commit 20f9781f ]
      
      When building our local version of perf with MSAN (Memory Sanitizer) and
      running the perf record command, MSAN throws a use of uninitialized
      value warning in "tools/perf/util/util.c:333:6".
      
      This warning stems from the "buf" variable being passed into "write".
      It originated as the variable "ev" with the type union perf_event*
      defined in the "perf_event__synthesize_attr" function in
      "tools/perf/util/header.c".
      
      In the "perf_event__synthesize_attr" function they allocate space with a malloc
      call using ev, then go on to only assign some of the member variables before
      passing "ev" on as a parameter to the "process" function therefore "ev"
      contains uninitialized memory. Changing the malloc call to zalloc to initialize
      all the members of "ev" which gets rid of the warning.
      
      To reproduce this warning, build perf by running:
      make -C tools/perf CLANG=1 CC=clang EXTRA_CFLAGS="-fsanitize=memory\
       -fsanitize-memory-track-origins"
      
      (Additionally, llvm might have to be installed and clang might have to
      be specified as the compiler - export CC=/usr/bin/clang)
      
      then running:
      tools/perf/perf record -o - ls / | tools/perf/perf --no-pager annotate\
       -i - --stdio
      
      Please see the cover letter for why false positive warnings may be
      generated.
      Signed-off-by: default avatarNumfor Mbiziwo-Tiapo <nums@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Drayton <mbd@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20190724234500.253358-2-nums@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      219db72f
    • Vince Weaver's avatar
      perf header: Fix divide by zero error if f_header.attr_size==0 · 5b9310f3
      Vince Weaver authored
      [ Upstream commit 7622236c ]
      
      So I have been having lots of trouble with hand-crafted perf.data files
      causing segfaults and the like, so I have started fuzzing the perf tool.
      
      First issue found:
      
      If f_header.attr_size is 0 in the perf.data file, then perf will crash
      with a divide-by-zero error.
      
      Committer note:
      
      Added a pr_err() to tell the user why the command failed.
      Signed-off-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1907231100440.14532@macbook-airSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5b9310f3
    • Lucas Stach's avatar
      irqchip/irq-imx-gpcv2: Forward irq type to parent · 632d97a3
      Lucas Stach authored
      [ Upstream commit 9a446ef0 ]
      
      The GPCv2 is a stacked IRQ controller below the ARM GIC. It doesn't
      care about the IRQ type itself, but needs to forward the type to the
      parent IRQ controller, so this one can be configured correctly.
      Signed-off-by: default avatarLucas Stach <l.stach@pengutronix.de>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      632d97a3
    • YueHaibing's avatar
      xen/pciback: remove set but not used variable 'old_state' · c1f57bed
      YueHaibing authored
      [ Upstream commit 09e088a4 ]
      
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      drivers/xen/xen-pciback/conf_space_capability.c: In function pm_ctrl_write:
      drivers/xen/xen-pciback/conf_space_capability.c:119:25: warning:
       variable old_state set but not used [-Wunused-but-set-variable]
      
      It is never used so can be removed.
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c1f57bed
    • Denis Kirjanov's avatar
      net: usb: pegasus: fix improper read if get_registers() fail · 58c33d47
      Denis Kirjanov authored
      commit 224c0497 upstream.
      
      get_registers() may fail with -ENOMEM and in this
      case we can read a garbage from the status variable tmp.
      
      Reported-by: syzbot+3499a83b2d062ae409d4@syzkaller.appspotmail.com
      Signed-off-by: default avatarDenis Kirjanov <kda@linux-powerpc.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      58c33d47
    • Oliver Neukum's avatar
      Input: iforce - add sanity checks · b8cab0b8
      Oliver Neukum authored
      commit 849f5ae3 upstream.
      
      The endpoint type should also be checked before a device
      is accepted.
      
      Reported-by: syzbot+5efc10c005014d061a74@syzkaller.appspotmail.com
      Signed-off-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b8cab0b8
    • Oliver Neukum's avatar
      Input: kbtab - sanity check for endpoint type · 9ab5ae53
      Oliver Neukum authored
      commit c88090df upstream.
      
      The driver should check whether the endpoint it uses has the correct
      type.
      
      Reported-by: syzbot+c7df50363aaff50aa363@syzkaller.appspotmail.com
      Signed-off-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ab5ae53
    • Hillf Danton's avatar
      HID: hiddev: do cleanup in failure of opening a device · 963a14fb
      Hillf Danton authored
      commit 6d4472d7 upstream.
      
      Undo what we did for opening before releasing the memory slice.
      Reported-by: default avatarsyzbot <syzbot+62a1e04fd3ec2abf099e@syzkaller.appspotmail.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarHillf Danton <hdanton@sina.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      963a14fb
    • Hillf Danton's avatar
      HID: hiddev: avoid opening a disconnected device · 52aaeae5
      Hillf Danton authored
      commit 9c09b214 upstream.
      
      syzbot found the following crash on:
      
      HEAD commit:    e96407b4 usb-fuzzer: main usb gadget fuzzer driver
      git tree:       https://github.com/google/kasan.git usb-fuzzer
      console output: https://syzkaller.appspot.com/x/log.txt?x=147ac20c600000
      kernel config:  https://syzkaller.appspot.com/x/.config?x=792eb47789f57810
      dashboard link: https://syzkaller.appspot.com/bug?extid=62a1e04fd3ec2abf099e
      compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
      
      ==================================================================
      BUG: KASAN: use-after-free in __lock_acquire+0x302a/0x3b50
      kernel/locking/lockdep.c:3753
      Read of size 8 at addr ffff8881cf591a08 by task syz-executor.1/26260
      
      CPU: 1 PID: 26260 Comm: syz-executor.1 Not tainted 5.3.0-rc2+ #24
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:77 [inline]
        dump_stack+0xca/0x13e lib/dump_stack.c:113
        print_address_description+0x6a/0x32c mm/kasan/report.c:351
        __kasan_report.cold+0x1a/0x33 mm/kasan/report.c:482
        kasan_report+0xe/0x12 mm/kasan/common.c:612
        __lock_acquire+0x302a/0x3b50 kernel/locking/lockdep.c:3753
        lock_acquire+0x127/0x320 kernel/locking/lockdep.c:4412
        __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
        _raw_spin_lock_irqsave+0x32/0x50 kernel/locking/spinlock.c:159
        hiddev_release+0x82/0x520 drivers/hid/usbhid/hiddev.c:221
        __fput+0x2d7/0x840 fs/file_table.c:280
        task_work_run+0x13f/0x1c0 kernel/task_work.c:113
        exit_task_work include/linux/task_work.h:22 [inline]
        do_exit+0x8ef/0x2c50 kernel/exit.c:878
        do_group_exit+0x125/0x340 kernel/exit.c:982
        get_signal+0x466/0x23d0 kernel/signal.c:2728
        do_signal+0x88/0x14e0 arch/x86/kernel/signal.c:815
        exit_to_usermode_loop+0x1a2/0x200 arch/x86/entry/common.c:159
        prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
        syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
        do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x459829
      Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
      48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
      ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f75b2a6ccf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
      RAX: fffffffffffffe00 RBX: 000000000075c078 RCX: 0000000000459829
      RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000000075c078
      RBP: 000000000075c070 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 000000000075c07c
      R13: 00007ffcdfe1023f R14: 00007f75b2a6d9c0 R15: 000000000075c07c
      
      Allocated by task 104:
        save_stack+0x1b/0x80 mm/kasan/common.c:69
        set_track mm/kasan/common.c:77 [inline]
        __kasan_kmalloc mm/kasan/common.c:487 [inline]
        __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:460
        kmalloc include/linux/slab.h:552 [inline]
        kzalloc include/linux/slab.h:748 [inline]
        hiddev_connect+0x242/0x5b0 drivers/hid/usbhid/hiddev.c:900
        hid_connect+0x239/0xbb0 drivers/hid/hid-core.c:1882
        hid_hw_start drivers/hid/hid-core.c:1981 [inline]
        hid_hw_start+0xa2/0x130 drivers/hid/hid-core.c:1972
        appleir_probe+0x13e/0x1a0 drivers/hid/hid-appleir.c:308
        hid_device_probe+0x2be/0x3f0 drivers/hid/hid-core.c:2209
        really_probe+0x281/0x650 drivers/base/dd.c:548
        driver_probe_device+0x101/0x1b0 drivers/base/dd.c:709
        __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:816
        bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
        __device_attach+0x217/0x360 drivers/base/dd.c:882
        bus_probe_device+0x1e4/0x290 drivers/base/bus.c:514
        device_add+0xae6/0x16f0 drivers/base/core.c:2114
        hid_add_device+0x33c/0x990 drivers/hid/hid-core.c:2365
        usbhid_probe+0xa81/0xfa0 drivers/hid/usbhid/hid-core.c:1386
        usb_probe_interface+0x305/0x7a0 drivers/usb/core/driver.c:361
        really_probe+0x281/0x650 drivers/base/dd.c:548
        driver_probe_device+0x101/0x1b0 drivers/base/dd.c:709
        __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:816
        bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
        __device_attach+0x217/0x360 drivers/base/dd.c:882
        bus_probe_device+0x1e4/0x290 drivers/base/bus.c:514
        device_add+0xae6/0x16f0 drivers/base/core.c:2114
        usb_set_configuration+0xdf6/0x1670 drivers/usb/core/message.c:2023
        generic_probe+0x9d/0xd5 drivers/usb/core/generic.c:210
        usb_probe_device+0x99/0x100 drivers/usb/core/driver.c:266
        really_probe+0x281/0x650 drivers/base/dd.c:548
        driver_probe_device+0x101/0x1b0 drivers/base/dd.c:709
        __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:816
        bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
        __device_attach+0x217/0x360 drivers/base/dd.c:882
        bus_probe_device+0x1e4/0x290 drivers/base/bus.c:514
        device_add+0xae6/0x16f0 drivers/base/core.c:2114
        usb_new_device.cold+0x6a4/0xe79 drivers/usb/core/hub.c:2536
        hub_port_connect drivers/usb/core/hub.c:5098 [inline]
        hub_port_connect_change drivers/usb/core/hub.c:5213 [inline]
        port_event drivers/usb/core/hub.c:5359 [inline]
        hub_event+0x1b5c/0x3640 drivers/usb/core/hub.c:5441
        process_one_work+0x92b/0x1530 kernel/workqueue.c:2269
        worker_thread+0x96/0xe20 kernel/workqueue.c:2415
        kthread+0x318/0x420 kernel/kthread.c:255
        ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
      
      Freed by task 104:
        save_stack+0x1b/0x80 mm/kasan/common.c:69
        set_track mm/kasan/common.c:77 [inline]
        __kasan_slab_free+0x130/0x180 mm/kasan/common.c:449
        slab_free_hook mm/slub.c:1423 [inline]
        slab_free_freelist_hook mm/slub.c:1470 [inline]
        slab_free mm/slub.c:3012 [inline]
        kfree+0xe4/0x2f0 mm/slub.c:3953
        hiddev_connect.cold+0x45/0x5c drivers/hid/usbhid/hiddev.c:914
        hid_connect+0x239/0xbb0 drivers/hid/hid-core.c:1882
        hid_hw_start drivers/hid/hid-core.c:1981 [inline]
        hid_hw_start+0xa2/0x130 drivers/hid/hid-core.c:1972
        appleir_probe+0x13e/0x1a0 drivers/hid/hid-appleir.c:308
        hid_device_probe+0x2be/0x3f0 drivers/hid/hid-core.c:2209
        really_probe+0x281/0x650 drivers/base/dd.c:548
        driver_probe_device+0x101/0x1b0 drivers/base/dd.c:709
        __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:816
        bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
        __device_attach+0x217/0x360 drivers/base/dd.c:882
        bus_probe_device+0x1e4/0x290 drivers/base/bus.c:514
        device_add+0xae6/0x16f0 drivers/base/core.c:2114
        hid_add_device+0x33c/0x990 drivers/hid/hid-core.c:2365
        usbhid_probe+0xa81/0xfa0 drivers/hid/usbhid/hid-core.c:1386
        usb_probe_interface+0x305/0x7a0 drivers/usb/core/driver.c:361
        really_probe+0x281/0x650 drivers/base/dd.c:548
        driver_probe_device+0x101/0x1b0 drivers/base/dd.c:709
        __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:816
        bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
        __device_attach+0x217/0x360 drivers/base/dd.c:882
        bus_probe_device+0x1e4/0x290 drivers/base/bus.c:514
        device_add+0xae6/0x16f0 drivers/base/core.c:2114
        usb_set_configuration+0xdf6/0x1670 drivers/usb/core/message.c:2023
        generic_probe+0x9d/0xd5 drivers/usb/core/generic.c:210
        usb_probe_device+0x99/0x100 drivers/usb/core/driver.c:266
        really_probe+0x281/0x650 drivers/base/dd.c:548
        driver_probe_device+0x101/0x1b0 drivers/base/dd.c:709
        __device_attach_driver+0x1c2/0x220 drivers/base/dd.c:816
        bus_for_each_drv+0x15c/0x1e0 drivers/base/bus.c:454
        __device_attach+0x217/0x360 drivers/base/dd.c:882
        bus_probe_device+0x1e4/0x290 drivers/base/bus.c:514
        device_add+0xae6/0x16f0 drivers/base/core.c:2114
        usb_new_device.cold+0x6a4/0xe79 drivers/usb/core/hub.c:2536
        hub_port_connect drivers/usb/core/hub.c:5098 [inline]
        hub_port_connect_change drivers/usb/core/hub.c:5213 [inline]
        port_event drivers/usb/core/hub.c:5359 [inline]
        hub_event+0x1b5c/0x3640 drivers/usb/core/hub.c:5441
        process_one_work+0x92b/0x1530 kernel/workqueue.c:2269
        worker_thread+0x96/0xe20 kernel/workqueue.c:2415
        kthread+0x318/0x420 kernel/kthread.c:255
        ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
      
      The buggy address belongs to the object at ffff8881cf591900
        which belongs to the cache kmalloc-512 of size 512
      The buggy address is located 264 bytes inside of
        512-byte region [ffff8881cf591900, ffff8881cf591b00)
      The buggy address belongs to the page:
      page:ffffea00073d6400 refcount:1 mapcount:0 mapping:ffff8881da002500
      index:0x0 compound_mapcount: 0
      flags: 0x200000000010200(slab|head)
      raw: 0200000000010200 0000000000000000 0000000100000001 ffff8881da002500
      raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
        ffff8881cf591900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
        ffff8881cf591980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      > ffff8881cf591a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                             ^
        ffff8881cf591a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
        ffff8881cf591b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      ==================================================================
      
      In order to avoid opening a disconnected device, we need to check exist
      again after acquiring the existance lock, and bail out if necessary.
      Reported-by: default avatarsyzbot <syzbot+62a1e04fd3ec2abf099e@syzkaller.appspotmail.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarHillf Danton <hdanton@sina.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      52aaeae5
    • Oliver Neukum's avatar
      HID: holtek: test for sanity of intfdata · bbbaeba7
      Oliver Neukum authored
      commit 01ec0a5f upstream.
      
      The ioctl handler uses the intfdata of a second interface,
      which may not be present in a broken or malicious device, hence
      the intfdata needs to be checked for NULL.
      
      [jkosina@suse.cz: fix newly added spurious space]
      Reported-by: syzbot+965152643a75a56737be@syzkaller.appspotmail.com
      Signed-off-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bbbaeba7
    • Hui Wang's avatar
      ALSA: hda - Let all conexant codec enter D3 when rebooting · f8053ac6
      Hui Wang authored
      commit 401714d9 upstream.
      
      We have 3 new lenovo laptops which have conexant codec 0x14f11f86,
      these 3 laptops also have the noise issue when rebooting, after
      letting the codec enter D3 before rebooting or poweroff, the noise
      disappers.
      
      Instead of adding a new ID again in the reboot_notify(), let us make
      this function apply to all conexant codec. In theory make codec enter
      D3 before rebooting or poweroff is harmless, and I tested this change
      on a couple of other Lenovo laptops which have different conexant
      codecs, there is no side effect so far.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHui Wang <hui.wang@canonical.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f8053ac6
    • Hui Wang's avatar
      ALSA: hda - Add a generic reboot_notify · f3f82e10
      Hui Wang authored
      commit 871b9066 upstream.
      
      Make codec enter D3 before rebooting or poweroff can fix the noise
      issue on some laptops. And in theory it is harmless for all codecs
      to enter D3 before rebooting or poweroff, let us add a generic
      reboot_notify, then realtek and conexant drivers can call this
      function.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHui Wang <hui.wang@canonical.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f3f82e10
    • Wenwen Wang's avatar
      ALSA: hda - Fix a memory leak bug · 3248c089
      Wenwen Wang authored
      commit cfef67f0 upstream.
      
      In snd_hda_parse_generic_codec(), 'spec' is allocated through kzalloc().
      Then, the pin widgets in 'codec' are parsed. However, if the parsing
      process fails, 'spec' is not deallocated, leading to a memory leak.
      
      To fix the above issue, free 'spec' before returning the error.
      
      Fixes: 352f7f91 ("ALSA: hda - Merge Realtek parser code to generic parser")
      Signed-off-by: default avatarWenwen Wang <wenwen@cs.uga.edu>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3248c089
    • Max Filippov's avatar
      xtensa: add missing isync to the cpu_reset TLB code · c6a46c61
      Max Filippov authored
      commit cd8869f4 upstream.
      
      ITLB entry modifications must be followed by the isync instruction
      before the new entries are possibly used. cpu_reset lacks one isync
      between ITLB way 6 initialization and jump to the identity mapping.
      Add missing isync to xtensa cpu_reset.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c6a46c61
    • Florian Westphal's avatar
      netfilter: ctnetlink: don't use conntrack/expect object addresses as id · 1922476b
      Florian Westphal authored
      commit 3c791076 upstream.
      
      else, we leak the addresses to userspace via ctnetlink events
      and dumps.
      
      Compute an ID on demand based on the immutable parts of nf_conn struct.
      
      Another advantage compared to using an address is that there is no
      immediate re-use of the same ID in case the conntrack entry is freed and
      reallocated again immediately.
      
      Fixes: 35832402 ("[NETFILTER]: nf_conntrack_expect: kill unique ID")
      Fixes: 7f85f914 ("[NETFILTER]: nf_conntrack: kill unique ID")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1922476b
    • Eric Dumazet's avatar
      inet: switch IP ID generator to siphash · b97a2f3d
      Eric Dumazet authored
      commit df453700 upstream.
      
      According to Amit Klein and Benny Pinkas, IP ID generation is too weak
      and might be used by attackers.
      
      Even with recent net_hash_mix() fix (netns: provide pure entropy for net_hash_mix())
      having 64bit key and Jenkins hash is risky.
      
      It is time to switch to siphash and its 128bit keys.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarAmit Klein <aksecurity@gmail.com>
      Reported-by: default avatarBenny Pinkas <benny@pinkas.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 4.9: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b97a2f3d
    • Jason A. Donenfeld's avatar
      siphash: implement HalfSipHash1-3 for hash tables · 175a407c
      Jason A. Donenfeld authored
      commit 1ae2324f upstream.
      
      HalfSipHash, or hsiphash, is a shortened version of SipHash, which
      generates 32-bit outputs using a weaker 64-bit key. It has *much* lower
      security margins, and shouldn't be used for anything too sensitive, but
      it could be used as a hashtable key function replacement, if the output
      is never exposed, and if the security requirement is not too high.
      
      The goal is to make this something that performance-critical jhash users
      would be willing to use.
      
      On 64-bit machines, HalfSipHash1-3 is slower than SipHash1-3, so we alias
      SipHash1-3 to HalfSipHash1-3 on those systems.
      
      64-bit x86_64:
      [    0.509409] test_siphash:     SipHash2-4 cycles: 4049181
      [    0.510650] test_siphash:     SipHash1-3 cycles: 2512884
      [    0.512205] test_siphash: HalfSipHash1-3 cycles: 3429920
      [    0.512904] test_siphash:    JenkinsHash cycles:  978267
      So, we map hsiphash() -> SipHash1-3
      
      32-bit x86:
      [    0.509868] test_siphash:     SipHash2-4 cycles: 14812892
      [    0.513601] test_siphash:     SipHash1-3 cycles:  9510710
      [    0.515263] test_siphash: HalfSipHash1-3 cycles:  3856157
      [    0.515952] test_siphash:    JenkinsHash cycles:  1148567
      So, we map hsiphash() -> HalfSipHash1-3
      
      hsiphash() is roughly 3 times slower than jhash(), but comes with a
      considerable security improvement.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Reviewed-by: default avatarJean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 4.9 to avoid regression for WireGuard with only half
       the siphash API present]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      175a407c
    • Jason A. Donenfeld's avatar
      siphash: add cryptographically secure PRF · 53e054b3
      Jason A. Donenfeld authored
      commit 2c956a60 upstream.
      
      SipHash is a 64-bit keyed hash function that is actually a
      cryptographically secure PRF, like HMAC. Except SipHash is super fast,
      and is meant to be used as a hashtable keyed lookup function, or as a
      general PRF for short input use cases, such as sequence numbers or RNG
      chaining.
      
      For the first usage:
      
      There are a variety of attacks known as "hashtable poisoning" in which an
      attacker forms some data such that the hash of that data will be the
      same, and then preceeds to fill up all entries of a hashbucket. This is
      a realistic and well-known denial-of-service vector. Currently
      hashtables use jhash, which is fast but not secure, and some kind of
      rotating key scheme (or none at all, which isn't good). SipHash is meant
      as a replacement for jhash in these cases.
      
      There are a modicum of places in the kernel that are vulnerable to
      hashtable poisoning attacks, either via userspace vectors or network
      vectors, and there's not a reliable mechanism inside the kernel at the
      moment to fix it. The first step toward fixing these issues is actually
      getting a secure primitive into the kernel for developers to use. Then
      we can, bit by bit, port things over to it as deemed appropriate.
      
      While SipHash is extremely fast for a cryptographically secure function,
      it is likely a bit slower than the insecure jhash, and so replacements
      will be evaluated on a case-by-case basis based on whether or not the
      difference in speed is negligible and whether or not the current jhash usage
      poses a real security risk.
      
      For the second usage:
      
      A few places in the kernel are using MD5 or SHA1 for creating secure
      sequence numbers, syn cookies, port numbers, or fast random numbers.
      SipHash is a faster and more fitting, and more secure replacement for MD5
      in those situations. Replacing MD5 and SHA1 with SipHash for these uses is
      obvious and straight-forward, and so is submitted along with this patch
      series. There shouldn't be much of a debate over its efficacy.
      
      Dozens of languages are already using this internally for their hash
      tables and PRFs. Some of the BSDs already use this in their kernels.
      SipHash is a widely known high-speed solution to a widely known set of
      problems, and it's time we catch-up.
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Reviewed-by: default avatarJean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: David Laight <David.Laight@aculab.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 4.9 as dependency of commits df453700 "inet: switch
       IP ID generator to siphash" and 3c791076 "netfilter: ctnetlink: don't
       use conntrack/expect object addresses as id"]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      53e054b3
    • Jason Wang's avatar
      vhost: scsi: add weight support · 02b40edd
      Jason Wang authored
      commit c1ea02f1 upstream.
      
      This patch will check the weight and exit the loop if we exceeds the
      weight. This is useful for preventing scsi kthread from hogging cpu
      which is guest triggerable.
      
      This addresses CVE-2019-3900.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Stefan Hajnoczi <stefanha@redhat.com>
      Fixes: 057cbf49 ("tcm_vhost: Initial merge for vhost level target fabric driver")
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      [bwh: Backported to 4.9:
       - Drop changes in vhost_scsi_ctl_handle_vq()
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02b40edd
    • Jason Wang's avatar
      vhost_net: fix possible infinite loop · 4b586288
      Jason Wang authored
      commit e2412c07 upstream.
      
      When the rx buffer is too small for a packet, we will discard the vq
      descriptor and retry it for the next packet:
      
      while ((sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
      					      &busyloop_intr))) {
      ...
      	/* On overrun, truncate and discard */
      	if (unlikely(headcount > UIO_MAXIOV)) {
      		iov_iter_init(&msg.msg_iter, READ, vq->iov, 1, 1);
      		err = sock->ops->recvmsg(sock, &msg,
      					 1, MSG_DONTWAIT | MSG_TRUNC);
      		pr_debug("Discarded rx packet: len %zd\n", sock_len);
      		continue;
      	}
      ...
      }
      
      This makes it possible to trigger a infinite while..continue loop
      through the co-opreation of two VMs like:
      
      1) Malicious VM1 allocate 1 byte rx buffer and try to slow down the
         vhost process as much as possible e.g using indirect descriptors or
         other.
      2) Malicious VM2 generate packets to VM1 as fast as possible
      
      Fixing this by checking against weight at the end of RX and TX
      loop. This also eliminate other similar cases when:
      
      - userspace is consuming the packets in the meanwhile
      - theoretical TOCTOU attack if guest moving avail index back and forth
        to hit the continue after vhost find guest just add new buffers
      
      This addresses CVE-2019-3900.
      
      Fixes: d8316f39 ("vhost: fix total length when packets are too short")
      Fixes: 3a4d5c94 ("vhost_net: a kernel-level virtio server")
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      [bwh: Backported to 4.9:
       - Both Tx modes are handled in one loop in handle_tx()
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4b586288
    • Jason Wang's avatar
      vhost: introduce vhost_exceeds_weight() · 66c8d9d5
      Jason Wang authored
      commit e82b9b07 upstream.
      
      We used to have vhost_exceeds_weight() for vhost-net to:
      
      - prevent vhost kthread from hogging the cpu
      - balance the time spent between TX and RX
      
      This function could be useful for vsock and scsi as well. So move it
      to vhost.c. Device must specify a weight which counts the number of
      requests, or it can also specific a byte_weight which counts the
      number of bytes that has been processed.
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Reviewed-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      [bwh: Backported to 4.9:
       - In vhost_net, both Tx modes are handled in one loop in handle_tx()
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      66c8d9d5
    • Jason Wang's avatar
      vhost_net: introduce vhost_exceeds_weight() · 6214511f
      Jason Wang authored
      commit 272f35cb upstream.
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 4.9: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6214511f
    • Paolo Abeni's avatar
      vhost_net: use packet weight for rx handler, too · 73f768b7
      Paolo Abeni authored
      commit db688c24 upstream.
      
      Similar to commit a2ac9990 ("vhost-net: set packet weight of
      tx polling to 2 * vq size"), we need a packet-based limit for
      handler_rx, too - elsewhere, under rx flood with small packets,
      tx can be delayed for a very long time, even without busypolling.
      
      The pkt limit applied to handle_rx must be the same applied by
      handle_tx, or we will get unfair scheduling between rx and tx.
      Tying such limit to the queue length makes it less effective for
      large queue length values and can introduce large process
      scheduler latencies, so a constant valued is used - likewise
      the existing bytes limit.
      
      The selected limit has been validated with PVP[1] performance
      test with different queue sizes:
      
      queue size		256	512	1024
      
      baseline		366	354	362
      weight 128		715	723	670
      weight 256		740	745	733
      weight 512		600	460	583
      weight 1024		423	427	418
      
      A packet weight of 256 gives peek performances in under all the
      tested scenarios.
      
      No measurable regression in unidirectional performance tests has
      been detected.
      
      [1] https://developers.redhat.com/blog/2017/06/05/measuring-and-comparing-open-vswitch-performance/Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      73f768b7
    • haibinzhang(张海斌)'s avatar
      vhost-net: set packet weight of tx polling to 2 * vq size · 43f7e9b8
      haibinzhang(张海斌) authored
      commit a2ac9990 upstream.
      
      handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
      polling udp packets with small length(e.g. 1byte udp payload), because setting
      VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet length.
      
      Ping-Latencies shown below were tested between two Virtual Machines using
      netperf (UDP_STREAM, len=1), and then another machine pinged the client:
      
      vq size=256
      Packet-Weight   Ping-Latencies(millisecond)
                         min      avg       max
      Origin           3.319   18.489    57.303
      64               1.643    2.021     2.552
      128              1.825    2.600     3.224
      256              1.997    2.710     4.295
      512              1.860    3.171     4.631
      1024             2.002    4.173     9.056
      2048             2.257    5.650     9.688
      4096             2.093    8.508    15.943
      
      vq size=512
      Packet-Weight   Ping-Latencies(millisecond)
                         min      avg       max
      Origin           6.537   29.177    66.245
      64               2.798    3.614     4.403
      128              2.861    3.820     4.775
      256              3.008    4.018     4.807
      512              3.254    4.523     5.824
      1024             3.079    5.335     7.747
      2048             3.944    8.201    12.762
      4096             4.158   11.057    19.985
      
      Seems pretty consistent, a small dip at 2 VQ sizes.
      Ring size is a hint from device about a burst size it can tolerate. Based on
      benchmarks, set the weight to 2 * vq size.
      
      To evaluate this change, another tests were done using netperf(RR, TX) between
      two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was
      tweaked through qemu. Results shown below does not show obvious changes.
      
      vq size=256 TCP_RR                vq size=512 TCP_RR
      size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
         1/       1/  -7%/        -2%      1/       1/   0%/        -2%
         1/       4/  +1%/         0%      1/       4/  +1%/         0%
         1/       8/  +1%/        -2%      1/       8/   0%/        +1%
        64/       1/  -6%/         0%     64/       1/  +7%/        +3%
        64/       4/   0%/        +2%     64/       4/  -1%/        +1%
        64/       8/   0%/         0%     64/       8/  -1%/        -2%
       256/       1/  -3%/        -4%    256/       1/  -4%/        -2%
       256/       4/  +3%/        +4%    256/       4/  +1%/        +2%
       256/       8/  +2%/         0%    256/       8/  +1%/        -1%
      
      vq size=256 UDP_RR                vq size=512 UDP_RR
      size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
         1/       1/  -5%/        +1%      1/       1/  -3%/        -2%
         1/       4/  +4%/        +1%      1/       4/  -2%/        +2%
         1/       8/  -1%/        -1%      1/       8/  -1%/         0%
        64/       1/  -2%/        -3%     64/       1/  +1%/        +1%
        64/       4/  -5%/        -1%     64/       4/  +2%/         0%
        64/       8/   0%/        -1%     64/       8/  -2%/        +1%
       256/       1/  +7%/        +1%    256/       1/  -7%/         0%
       256/       4/  +1%/        +1%    256/       4/  -3%/        -4%
       256/       8/  +2%/        +2%    256/       8/  +1%/        +1%
      
      vq size=256 TCP_STREAM            vq size=512 TCP_STREAM
      size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
        64/       1/   0%/        -3%     64/       1/   0%/         0%
        64/       4/  +3%/        -1%     64/       4/  -2%/        +4%
        64/       8/  +9%/        -4%     64/       8/  -1%/        +2%
       256/       1/  +1%/        -4%    256/       1/  +1%/        +1%
       256/       4/  -1%/        -1%    256/       4/  -3%/         0%
       256/       8/  +7%/        +5%    256/       8/  -3%/         0%
       512/       1/  +1%/         0%    512/       1/  -1%/        -1%
       512/       4/  +1%/        -1%    512/       4/   0%/         0%
       512/       8/  +7%/        -5%    512/       8/  +6%/        -1%
      1024/       1/   0%/        -1%   1024/       1/   0%/        +1%
      1024/       4/  +3%/         0%   1024/       4/  +1%/         0%
      1024/       8/  +8%/        +5%   1024/       8/  -1%/         0%
      2048/       1/  +2%/        +2%   2048/       1/  -1%/         0%
      2048/       4/  +1%/         0%   2048/       4/   0%/        -1%
      2048/       8/  -2%/         0%   2048/       8/   5%/        -1%
      4096/       1/  -2%/         0%   4096/       1/  -2%/         0%
      4096/       4/  +2%/         0%   4096/       4/   0%/         0%
      4096/       8/  +9%/        -2%   4096/       8/  -5%/        -1%
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarHaibin Zhang <haibinzhang@tencent.com>
      Signed-off-by: default avatarYunfang Tai <yunfangtai@tencent.com>
      Signed-off-by: default avatarLidong Chen <lidongchen@tencent.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      43f7e9b8
    • Daniel Borkmann's avatar
      bpf: add bpf_jit_limit knob to restrict unpriv allocations · c98446e1
      Daniel Borkmann authored
      commit ede95a63 upstream.
      
      Rick reported that the BPF JIT could potentially fill the entire module
      space with BPF programs from unprivileged users which would prevent later
      attempts to load normal kernel modules or privileged BPF programs, for
      example. If JIT was enabled but unsuccessful to generate the image, then
      before commit 290af866 ("bpf: introduce BPF_JIT_ALWAYS_ON config")
      we would always fall back to the BPF interpreter. Nowadays in the case
      where the CONFIG_BPF_JIT_ALWAYS_ON could be set, then the load will abort
      with a failure since the BPF interpreter was compiled out.
      
      Add a global limit and enforce it for unprivileged users such that in case
      of BPF interpreter compiled out we fail once the limit has been reached
      or we fall back to BPF interpreter earlier w/o using module mem if latter
      was compiled in. In a next step, fair share among unprivileged users can
      be resolved in particular for the case where we would fail hard once limit
      is reached.
      
      Fixes: 290af866 ("bpf: introduce BPF_JIT_ALWAYS_ON config")
      Fixes: 0a14842f ("net: filter: Just In Time compiler for x86-64")
      Co-Developed-by: default avatarRick Edgecombe <rick.p.edgecombe@intel.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: LKML <linux-kernel@vger.kernel.org>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      [bwh: Backported to 4.9: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c98446e1
    • Daniel Borkmann's avatar
      bpf: restrict access to core bpf sysctls · 4d047570
      Daniel Borkmann authored
      commit 2e4a3098 upstream.
      
      Given BPF reaches far beyond just networking these days, it was
      never intended to allow setting and in some cases reading those
      knobs out of a user namespace root running without CAP_SYS_ADMIN,
      thus tighten such access.
      
      Also the bpf_jit_enable = 2 debugging mode should only be allowed
      if kptr_restrict is not set since it otherwise can leak addresses
      to the kernel log. Dump a note to the kernel log that this is for
      debugging JITs only when enabled.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      [bwh: Backported to 4.9:
       - We don't have bpf_dump_raw_ok(), so drop the condition based on it. This
         condition only made it a bit harder for a privileged user to do something
         silly.
       - Drop change to bpf_jit_kallsyms]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d047570
    • Daniel Borkmann's avatar
      bpf: get rid of pure_initcall dependency to enable jits · 5124abda
      Daniel Borkmann authored
      commit fa9dd599 upstream.
      
      Having a pure_initcall() callback just to permanently enable BPF
      JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
      a small race window in future where JIT is still disabled on boot.
      Since we know about the setting at compilation time anyway, just
      initialize it properly there. Also consolidate all the individual
      bpf_jit_enable variables into a single one and move them under one
      location. Moreover, don't allow for setting unspecified garbage
      values on them.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      [bwh: Backported to 4.9 as dependency of commit 2e4a3098
       "bpf: restrict access to core bpf sysctls":
       - Drop change in arch/mips/net/ebpf_jit.c
       - Drop change to bpf_jit_kallsyms
       - Adjust filenames, context]
      Signed-off-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5124abda
    • Miles Chen's avatar
      mm/memcontrol.c: fix use after free in mem_cgroup_iter() · 43729e6f
      Miles Chen authored
      commit 54a83d6b upstream.
      
      This patch is sent to report an use after free in mem_cgroup_iter()
      after merging commit be265775 ("mm: memcg: fix use after free in
      mem_cgroup_iter()").
      
      I work with android kernel tree (4.9 & 4.14), and commit be265775
      ("mm: memcg: fix use after free in mem_cgroup_iter()") has been merged
      to the trees.  However, I can still observe use after free issues
      addressed in the commit be265775.  (on low-end devices, a few times
      this month)
      
      backtrace:
              css_tryget <- crash here
              mem_cgroup_iter
              shrink_node
              shrink_zones
              do_try_to_free_pages
              try_to_free_pages
              __perform_reclaim
              __alloc_pages_direct_reclaim
              __alloc_pages_slowpath
              __alloc_pages_nodemask
      
      To debug, I poisoned mem_cgroup before freeing it:
      
        static void __mem_cgroup_free(struct mem_cgroup *memcg)
              for_each_node(node)
              free_mem_cgroup_per_node_info(memcg, node);
              free_percpu(memcg->stat);
        +     /* poison memcg before freeing it */
        +     memset(memcg, 0x78, sizeof(struct mem_cgroup));
              kfree(memcg);
        }
      
      The coredump shows the position=0xdbbc2a00 is freed.
      
        (gdb) p/x ((struct mem_cgroup_per_node *)0xe5009e00)->iter[8]
        $13 = {position = 0xdbbc2a00, generation = 0x2efd}
      
        0xdbbc2a00:     0xdbbc2e00      0x00000000      0xdbbc2800      0x00000100
        0xdbbc2a10:     0x00000200      0x78787878      0x00026218      0x00000000
        0xdbbc2a20:     0xdcad6000      0x00000001      0x78787800      0x00000000
        0xdbbc2a30:     0x78780000      0x00000000      0x0068fb84      0x78787878
        0xdbbc2a40:     0x78787878      0x78787878      0x78787878      0xe3fa5cc0
        0xdbbc2a50:     0x78787878      0x78787878      0x00000000      0x00000000
        0xdbbc2a60:     0x00000000      0x00000000      0x00000000      0x00000000
        0xdbbc2a70:     0x00000000      0x00000000      0x00000000      0x00000000
        0xdbbc2a80:     0x00000000      0x00000000      0x00000000      0x00000000
        0xdbbc2a90:     0x00000001      0x00000000      0x00000000      0x00100000
        0xdbbc2aa0:     0x00000001      0xdbbc2ac8      0x00000000      0x00000000
        0xdbbc2ab0:     0x00000000      0x00000000      0x00000000      0x00000000
        0xdbbc2ac0:     0x00000000      0x00000000      0xe5b02618      0x00001000
        0xdbbc2ad0:     0x00000000      0x78787878      0x78787878      0x78787878
        0xdbbc2ae0:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2af0:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2b00:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2b10:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2b20:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2b30:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2b40:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2b50:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2b60:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2b70:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2b80:     0x78787878      0x78787878      0x00000000      0x78787878
        0xdbbc2b90:     0x78787878      0x78787878      0x78787878      0x78787878
        0xdbbc2ba0:     0x78787878      0x78787878      0x78787878      0x78787878
      
      In the reclaim path, try_to_free_pages() does not setup
      sc.target_mem_cgroup and sc is passed to do_try_to_free_pages(), ...,
      shrink_node().
      
      In mem_cgroup_iter(), root is set to root_mem_cgroup because
      sc->target_mem_cgroup is NULL.  It is possible to assign a memcg to
      root_mem_cgroup.nodeinfo.iter in mem_cgroup_iter().
      
              try_to_free_pages
              	struct scan_control sc = {...}, target_mem_cgroup is 0x0;
              do_try_to_free_pages
              shrink_zones
              shrink_node
              	 mem_cgroup *root = sc->target_mem_cgroup;
              	 memcg = mem_cgroup_iter(root, NULL, &reclaim);
              mem_cgroup_iter()
              	if (!root)
              		root = root_mem_cgroup;
              	...
      
              	css = css_next_descendant_pre(css, &root->css);
              	memcg = mem_cgroup_from_css(css);
              	cmpxchg(&iter->position, pos, memcg);
      
      My device uses memcg non-hierarchical mode.  When we release a memcg:
      invalidate_reclaim_iterators() reaches only dead_memcg and its parents.
      If non-hierarchical mode is used, invalidate_reclaim_iterators() never
      reaches root_mem_cgroup.
      
        static void invalidate_reclaim_iterators(struct mem_cgroup *dead_memcg)
        {
              struct mem_cgroup *memcg = dead_memcg;
      
              for (; memcg; memcg = parent_mem_cgroup(memcg)
              ...
        }
      
      So the use after free scenario looks like:
      
        CPU1						CPU2
      
        try_to_free_pages
        do_try_to_free_pages
        shrink_zones
        shrink_node
        mem_cgroup_iter()
            if (!root)
            	root = root_mem_cgroup;
            ...
            css = css_next_descendant_pre(css, &root->css);
            memcg = mem_cgroup_from_css(css);
            cmpxchg(&iter->position, pos, memcg);
      
              				invalidate_reclaim_iterators(memcg);
              				...
              				__mem_cgroup_free()
              					kfree(memcg);
      
        try_to_free_pages
        do_try_to_free_pages
        shrink_zones
        shrink_node
        mem_cgroup_iter()
            if (!root)
            	root = root_mem_cgroup;
            ...
            mz = mem_cgroup_nodeinfo(root, reclaim->pgdat->node_id);
            iter = &mz->iter[reclaim->priority];
            pos = READ_ONCE(iter->position);
            css_tryget(&pos->css) <- use after free
      
      To avoid this, we should also invalidate root_mem_cgroup.nodeinfo.iter
      in invalidate_reclaim_iterators().
      
      [cai@lca.pw: fix -Wparentheses compilation warning]
        Link: http://lkml.kernel.org/r/1564580753-17531-1-git-send-email-cai@lca.pw
      Link: http://lkml.kernel.org/r/20190730015729.4406-1-miles.chen@mediatek.com
      Fixes: 5ac8fb31 ("mm: memcontrol: convert reclaim iterator to simple css refcounting")
      Signed-off-by: default avatarMiles Chen <miles.chen@mediatek.com>
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      43729e6f
    • Isaac J. Manjarres's avatar
      mm/usercopy: use memory range to be accessed for wraparound check · faf6760c
      Isaac J. Manjarres authored
      commit 95153169 upstream.
      
      Currently, when checking to see if accessing n bytes starting at address
      "ptr" will cause a wraparound in the memory addresses, the check in
      check_bogus_address() adds an extra byte, which is incorrect, as the
      range of addresses that will be accessed is [ptr, ptr + (n - 1)].
      
      This can lead to incorrectly detecting a wraparound in the memory
      address, when trying to read 4 KB from memory that is mapped to the the
      last possible page in the virtual address space, when in fact, accessing
      that range of memory would not cause a wraparound to occur.
      
      Use the memory range that will actually be accessed when considering if
      accessing a certain amount of bytes will cause the memory address to
      wrap around.
      
      Link: http://lkml.kernel.org/r/1564509253-23287-1-git-send-email-isaacm@codeaurora.org
      Fixes: f5509cc1 ("mm: Hardened usercopy")
      Signed-off-by: default avatarPrasad Sodagudi <psodagud@codeaurora.org>
      Signed-off-by: default avatarIsaac J. Manjarres <isaacm@codeaurora.org>
      Co-developed-by: default avatarPrasad Sodagudi <psodagud@codeaurora.org>
      Reviewed-by: default avatarWilliam Kucharski <william.kucharski@oracle.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Trilok Soni <tsoni@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [kees: backport to v4.9]
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      faf6760c
    • Gustavo A. R. Silva's avatar
      sh: kernel: hw_breakpoint: Fix missing break in switch statement · 694457ee
      Gustavo A. R. Silva authored
      commit 1ee1119d upstream.
      
      Add missing break statement in order to prevent the code from falling
      through to case SH_BREAKPOINT_WRITE.
      
      Fixes: 09a07294 ("sh: hw-breakpoints: Add preliminary support for SH-4A UBC.")
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Tested-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      694457ee
    • Suganath Prabu's avatar
      scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA · 718ce1eb
      Suganath Prabu authored
      commit df9a6061 upstream.
      
      Although SAS3 & SAS3.5 IT HBA controllers support 64-bit DMA addressing, as
      per hardware design, if DMA-able range contains all 64-bits
      set (0xFFFFFFFF-FFFFFFFF) then it results in a firmware fault.
      
      E.g. SGE's start address is 0xFFFFFFFF-FFFF000 and data length is 0x1000
      bytes. when HBA tries to DMA the data at 0xFFFFFFFF-FFFFFFFF location then
      HBA will fault the firmware.
      
      Driver will set 63-bit DMA mask to ensure the above address will not be
      used.
      
      Cc: <stable@vger.kernel.org> # 5.1.20+
      Signed-off-by: default avatarSuganath Prabu <suganath-prabu.subramani@broadcom.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      718ce1eb