1. 22 Nov, 2018 1 commit
  2. 21 Nov, 2018 1 commit
    • Merge tag 'riscv-for-linus-4.20-rc4' of... · 92b41928
      Linus Torvalds authored
      Merge tag 'riscv-for-linus-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
      
      Pull RISC-V fixes from Palmer Dabbelt:
       "This week is a bit bigger than I expected. That's my fault, as I
        missed a few patches while I was at Plumbers last week. We have:
      
         - A fix to a quite embarrassing issue where raw_copy_to_user() was
           implemented with asm_copy_from_user() (and vice versa).
      
         - Improvements to our makefile to allow flat binaries to be
           generated.
      
         - A build fix that predeclares "struct module" at the top of
           <asm/module.h>, avoiding the warnings that its absence triggered
           later in that header.
      
         - The addition of our own <uapi/asm/unistd.h> header, which is
           necessary to align our stat ABI on 32-bit systems.
      
         - A fix to avoid printing a warning when the S or U bits are set in
           print_isa().
      
        I already have one patch in the queue for next week"
      
      * tag 'riscv-for-linus-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
        RISC-V: recognize S/U mode bits in print_isa
        riscv: add asm/unistd.h UAPI header
        riscv: fix warning in arch/riscv/include/asm/module.h
        RISC-V: Build flat and compressed kernel images
        RISC-V: Fix raw_copy_{to,from}_user()
      92b41928
  3. 20 Nov, 2018 7 commits
    • Merge tag 'mips_fixes_4.20_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · c8ce94b8
      Linus Torvalds authored
      Pull MIPS fixes from Paul Burton:
       "A few MIPS fixes for 4.20:
      
         - Re-enable the Cavium Octeon USB driver in its defconfig after it
           was accidentally removed back in 4.14.
      
         - Have early memblock allocations be performed bottom-up to more
           closely match the behaviour we used to have with bootmem, which
           seems a safer choice since we've seen fallout from the change made
           in the 4.20 merge window.
      
         - Simplify max_low_pfn calculation in the NUMA code for the Loongson3
           and SGI IP27 platforms to both clean up the code & ensure
           max_low_pfn has been set appropriately before it is used"
      
      * tag 'mips_fixes_4.20_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: Loongson3,SGI-IP27: Simplify max_low_pfn calculation
        MIPS: Let early memblock_alloc*() allocate memories bottom-up
        MIPS: OCTEON: cavium_octeon_defconfig: re-enable OCTEON USB driver
      c8ce94b8
    • Merge tag 'media/v4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 06e68fed
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
      
       - add a missing include at v4l2-controls uAPI header
      
       - minor kAPI update for the request API
      
       - some fixes at CEC core
      
       - use a lower minimum height for the virtual codec driver
      
       - clean up a gcc warning due to the lack of a fall through markup
      
       - tc358743: Remove unnecessary self assignment
      
       - fix the V4L event subscription logic
      
       - docs: Document metadata format in struct v4l2_format
      
       - omap3isp and ipu3-cio2: fix unbinding logic
      
      * tag 'media/v4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        media: ipu3-cio2: Use cio2_queues_exit
        media: ipu3-cio2: Unregister device nodes first, then release resources
        media: omap3isp: Unregister media device as first
        media: docs: Document metadata format in struct v4l2_format
        media: v4l: event: Add subscription to list before calling "add" operation
        media: dm365_ipipeif: better annotate a fall though
        media: Rename vb2_m2m_request_queue -> v4l2_m2m_request_queue
        media: cec: increase debug level for 'queue full'
        media: cec: check for non-OK/NACK conditions while claiming a LA
        media: vicodec: lower minimum height to 360
        media: tc358743: Remove unnecessary self assignment
        media: v4l: fix uapi mpeg slice params definition
        v4l2-controls: add a missing include
      06e68fed
    • RISC-V: recognize S/U mode bits in print_isa · 5d8f81ba
      Patrick Stählin authored
      Removes the warning about an unsupported ISA when reading /proc/cpuinfo
      on QEMU. The "S" extension is not being returned as it is not accessible
      from userspace.
      Signed-off-by: Patrick Stählin <me@packi.ch>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      5d8f81ba
    • riscv: add asm/unistd.h UAPI header · 27f8899d
      David Abdurachmanov authored
      Marcin Juszkiewicz reported issues while generating syscall table for riscv
      using 4.20-rc1. The patch refactors our unistd.h files to match some other
      architectures.
      
      - Add asm/unistd.h UAPI header, which has __ARCH_WANT_NEW_STAT only for 64-bit
      - Remove asm/syscalls.h UAPI header and merge to asm/unistd.h
      - Adjust kernel asm/unistd.h
      
      So now asm/unistd.h UAPI header should show all syscalls for riscv.
      
      Before this, the Makefile simply put `#include <asm-generic/unistd.h>` into
      the generated asm/unistd.h UAPI header, so users didn't see:
      
      - __NR_riscv_flush_icache
      - __NR_newfstatat
      - __NR_fstat
      
      which are supported by riscv kernel.
      Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Fixes: 67314ec7 ("RISC-V: Request newstat syscalls")
      Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
      Acked-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      27f8899d
    • riscv: fix warning in arch/riscv/include/asm/module.h · 0138ebb9
      David Abdurachmanov authored
      Fixes warning: 'struct module' declared inside parameter list will not be
      visible outside of this definition or declaration
      Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
      Acked-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      0138ebb9
    • RISC-V: Build flat and compressed kernel images · c0fbcd99
      Anup Patel authored
      This patch extends the Linux RISC-V build system to build and install:
      Image - Flat uncompressed kernel image
      Image.gz - Flat and GZip compressed kernel image

      Quite a few bootloaders (such as Uboot, UEFI, etc) are capable of
      booting flat and compressed kernel images. In case of Uboot, booting
      Image or Image.gz is achieved using bootm command.

      The flat and uncompressed kernel image (i.e. Image) is very useful
      in pre-silicon development and testing because we can create back-door
      HEX files for RAM on FPGAs from Image.
      Signed-off-by: Anup Patel <anup@brainfault.org>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      c0fbcd99
    • RISC-V: Fix raw_copy_{to,from}_user() · 21f70d4a
      Olof Johansson authored
      Sparse highlighted it, and it appears to be a pure bug (from vs to).
      
      ./arch/riscv/include/asm/uaccess.h:403:35: warning: incorrect type in argument 1 (different address spaces)
      ./arch/riscv/include/asm/uaccess.h:403:39: warning: incorrect type in argument 2 (different address spaces)
      ./arch/riscv/include/asm/uaccess.h:409:37: warning: incorrect type in argument 1 (different address spaces)
      ./arch/riscv/include/asm/uaccess.h:409:41: warning: incorrect type in argument 2 (different address spaces)
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Cc: stable@vger.kernel.org
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      21f70d4a
  4. 19 Nov, 2018 3 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f2ce1065
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix some potentially uninitialized variables and use-after-free in
          kvaser_usb CAN driver, from Jimmy Assarsson.
      
       2) Fix leaks in qed driver, from Denis Bolotin.
      
       3) Socket leak in l2tp, from Xin Long.
      
       4) RSS context allocation fix in bnxt_en from Michael Chan.
      
       5) Fix cxgb4 build errors, from Ganesh Goudar.
      
       6) Route leaks in ipv6 when removing exceptions, from Xin Long.
      
       7) Memory leak in IDR allocation handling of act_pedit, from Davide
          Caratti.
      
       8) Use-after-free of bridge vlan stats, from Nikolay Aleksandrov.
      
       9) When MTU is locked, do not force DF bit on ipv4 tunnels. From
          Sabrina Dubroca.
      
      10) When NAPI cached skb is reused, we must set it to the proper initial
          state which includes skb->pkt_type. From Eric Dumazet.
      
      11) Lockdep and non-linear SKB handling fix in tipc from Jon Maloy.
      
      12) Set RX queue properly in various tuntap receive paths, from Matthew
          Cover.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
        tuntap: fix multiqueue rx
        ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF
        tipc: don't assume linear buffer when reading ancillary data
        tipc: fix lockdep warning when reinitilaizing sockets
        net-gro: reset skb->pkt_type in napi_reuse_skb()
        tc-testing: tdc.py: Guard against lack of returncode in executed command
        tc-testing: tdc.py: ignore errors when decoding stdout/stderr
        ip_tunnel: don't force DF when MTU is locked
        MAINTAINERS: Add entry for CAKE qdisc
        net: bridge: fix vlan stats use-after-free on destruction
        socket: do a generic_file_splice_read when proto_ops has no splice_read
        net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
        Revert "net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs"
        net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
        net/sched: act_pedit: fix memory leak when IDR allocation fails
        net: lantiq: Fix returned value in case of error in 'xrx200_probe()'
        ipv6: fix a dst leak when removing its exception
        net: mvneta: Don't advertise 2.5G modes
        drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo
        net/mlx4: Fix UBSAN warning of signed integer overflow
        ...
      f2ce1065
    • tuntap: fix multiqueue rx · 8ebebcba
      Matthew Cover authored
      When writing packets to a descriptor associated with a combined queue, the
      packets should end up on that queue.
      
      Before this change all packets written to any descriptor associated with a
      tap interface end up on rx-0, even when the descriptor is associated with a
      different queue.
      
      The rx traffic can be generated by either of the following.
        1. a simple tap program which spins up multiple queues and writes packets
           to each of the file descriptors
        2. tx from a qemu vm with a tap multiqueue netdev
      
      The queue for rx traffic can be observed by either of the following (done
      on the hypervisor in the qemu case).
        1. a simple netmap program which opens and reads from per-queue
           descriptors
        2. configuring RPS and doing per-cpu captures with rxtxcpu
      
      Alternatively, if you printk() the return value of skb_get_rx_queue() just
      before each instance of netif_receive_skb() in tun.c, you will get 65535
      for every skb.
      
      Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
      the association between descriptor and rx queue.
      Signed-off-by: Matthew Cover <matthew.cover@stackpath.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8ebebcba
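The 65535 reading in the tuntap commit above can be reproduced with a small model. This is a minimal Python sketch, assuming the encoding used by skb_record_rx_queue() and skb_get_rx_queue() in include/linux/skbuff.h: the queue is stored as queue + 1 in a 16-bit field, so a field that was never set reads back as 0 - 1 wrapped to 16 bits. The Skb class and tun_receive() are illustrative names, not kernel code.

```python
from dataclasses import dataclass

U16_MASK = 0xFFFF  # the modelled field is a 16-bit unsigned integer


@dataclass
class Skb:
    """Toy stand-in for struct sk_buff; only the queue mapping is modelled."""
    queue_mapping: int = 0  # 0 means "no rx queue recorded"


def skb_record_rx_queue(skb: Skb, rx_queue: int) -> None:
    # Store queue + 1 so that 0 can keep meaning "not recorded".
    skb.queue_mapping = (rx_queue + 1) & U16_MASK


def skb_get_rx_queue(skb: Skb) -> int:
    # Undo the +1 offset with 16-bit wraparound.
    return (skb.queue_mapping - 1) & U16_MASK


def tun_receive(queue_index: int, record_queue: bool) -> int:
    """Model one packet written to the descriptor for queue_index."""
    skb = Skb()
    if record_queue:  # behaviour with the fix applied
        skb_record_rx_queue(skb, queue_index)
    return skb_get_rx_queue(skb)
```

Calling tun_receive(3, record_queue=False) models the pre-fix path, where the queue is never recorded; with record_queue=True the value read back matches the descriptor's queue index.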
    • ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF · 7ddacfa5
      David Ahern authored
      Preethi reported that PMTU discovery for UDP/raw applications is not
      working in the presence of VRF when the socket is not bound to a device.
      The problem is that ip6_sk_update_pmtu does not consider the L3 domain
      of the skb device if the socket is not bound. Update the function to
      set oif to the L3 master device if relevant.
      
      Fixes: ca254490 ("net: Add VRF support to IPv6 stack")
      Reported-by: Preethi Ramachandra <preethir@juniper.net>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7ddacfa5
  5. 18 Nov, 2018 28 commits
    • Linux 4.20-rc3 · 9ff01193
      Linus Torvalds authored
      9ff01193
    • Merge tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 25e19c1f
      Linus Torvalds authored
      Pull libnvdimm fixes from Dan Williams:
       "A small batch of fixes for v4.20-rc3.
      
        The overflow continuation fix addresses something that has been broken
        for several releases. Arguably it could wait even longer, but it's a
        one line fix and this finishes the last of the known address range
        scrub bug reports. The revert addresses a lockdep regression. The unit
        tests are not critical to fix, but no reason to hold this fix back.
      
        Summary:
      
         - Address Range Scrub overflow continuation handling has been broken
           since it was initially merged. It was only recently that error
           injection and platform-BIOS support enabled this corner case to be
           exercised.
      
         - The recent attempt to provide more isolation for the kernel Address
           Range Scrub state machine from userspace initiated sessions
           triggers a lockdep report. Revert and try again at the next merge
           window.
      
         - Fix a kasan reported buffer overflow in libnvdimm unit test
           infrastructure (nfit_test)"
      
      * tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        Revert "acpi, nfit: Further restrict userspace ARS start requests"
        acpi, nfit: Fix ARS overflow continuation
        tools/testing/nvdimm: Fix the array size for dimm devices.
      25e19c1f
    • Merge branch 'akpm' (patches from Andrew) · c67a98c0
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "16 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/memblock.c: fix a typo in __next_mem_pfn_range() comments
        mm, page_alloc: check for max order in hot path
        scripts/spdxcheck.py: make python3 compliant
        tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset
        lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn
        mm/vmstat.c: fix NUMA statistics updates
        mm/gup.c: fix follow_page_mask() kerneldoc comment
        ocfs2: free up write context when direct IO failed
        scripts/faddr2line: fix location of start_kernel in comment
        mm: don't reclaim inodes with many attached pages
        mm, memory_hotplug: check zone_movable in has_unmovable_pages
        mm/swapfile.c: use kvzalloc for swap_info_struct allocation
        MAINTAINERS: update OMAP MMC entry
        hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444!
        kernel/sched/psi.c: simplify cgroup_move_task()
        z3fold: fix possible reclaim races
      c67a98c0
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 03582f33
      Linus Torvalds authored
      Pull scheduler fix from Ingo Molnar:
       "Fix an exec() related scalability/performance regression, which was
        caused by incorrectly calculating load and migrating tasks on exec()
        when they shouldn't be"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Fix cpu_util_wake() for 'execl' type workloads
      03582f33
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b53e27f6
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Fix uncore PMU enumeration for CoffeeLake CPUs"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel/uncore: Support CoffeeLake 8th CBOX
        perf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake CPUs
      b53e27f6
    • Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 743a4863
      Linus Torvalds authored
      Pull EFI fixes from Ingo Molnar:
       "Misc fixes: two warning splat fixes, a leak fix and persistent memory
        allocation fixes for ARM"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: Permit calling efi_mem_reserve_persistent() from atomic context
        efi/arm: Defer persistent reservations until after paging_init()
        efi/arm/libstub: Pack FDT after populating it
        efi/arm: Revert deferred unmap of early memmap mapping
        efi: Fix debugobjects warning on 'efi_rts_work'
      743a4863
    • Merge branch 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm · cfaa9f02
      Linus Torvalds authored
      Pull ARM spectre updates from Russell King:
       "These are the currently known final bits that resolve the Spectre
        issues. big.Little systems used to be sufficiently identical in that
        there were no differences between individual CPUs in the system that
        mattered to the kernel. With the advent of the Spectre problem, the
        CPUs now have differences in how the workaround is applied.
      
        As a result of previous Spectre patches, these systems ended up
        reporting quite a lot of:
      
           "CPUx: Spectre v2: incorrect context switching function, system vulnerable"
      
        messages due to the action of the big.Little switcher causing the CPUs
        to be re-initialised regularly. This series resolves that issue by
        making the CPU vtable unique to each CPU.
      
        However, since this is used very early, before per-cpu is set up,
        per-cpu can't be used. We also have a problem that two of the methods
        are not called from preempt-safe paths, but thankfully these remain
        identical between all CPUs in the system. To make sure, we validate
        that these are identical during boot"
      
      * 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: spectre-v2: per-CPU vtables to work around big.Little systems
        ARM: add PROC_VTABLE and PROC_TABLE macros
        ARM: clean up per-processor check_bugs method call
        ARM: split out processor lookup
        ARM: make lookup_processor_type() non-__init
      cfaa9f02
    • mm, page_alloc: check for max order in hot path · c63ae43b
      Michal Hocko authored
      Konstantin has noticed that kvmalloc might trigger the following
      warning:
      
        WARNING: CPU: 0 PID: 6676 at mm/vmstat.c:986 __fragmentation_index+0x54/0x60
        [...]
        Call Trace:
         fragmentation_index+0x76/0x90
         compaction_suitable+0x4f/0xf0
         shrink_node+0x295/0x310
         node_reclaim+0x205/0x250
         get_page_from_freelist+0x649/0xad0
         __alloc_pages_nodemask+0x12a/0x2a0
         kmalloc_large_node+0x47/0x90
         __kmalloc_node+0x22b/0x2e0
         kvmalloc_node+0x3e/0x70
         xt_alloc_table_info+0x3a/0x80 [x_tables]
         do_ip6t_set_ctl+0xcd/0x1c0 [ip6_tables]
         nf_setsockopt+0x44/0x60
         SyS_setsockopt+0x6f/0xc0
         do_syscall_64+0x67/0x120
         entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      The problem is that we only check for an out of bound order in the slow
      path and the node reclaim might happen from the fast path already.  This
      is fixable by making sure that kvmalloc doesn't ever use kmalloc for
      requests that are larger than KMALLOC_MAX_SIZE but this also shows that
      the code is rather fragile.  A recent UBSAN report underlines that:
      
        UBSAN: Undefined behaviour in mm/page_alloc.c:3117:19
        shift exponent 51 is too large for 32-bit type 'int'
        CPU: 0 PID: 6520 Comm: syz-executor1 Not tainted 4.19.0-rc2 #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0xd2/0x148 lib/dump_stack.c:113
         ubsan_epilogue+0x12/0x94 lib/ubsan.c:159
         __ubsan_handle_shift_out_of_bounds+0x2b6/0x30b lib/ubsan.c:425
         __zone_watermark_ok+0x2c7/0x400 mm/page_alloc.c:3117
         zone_watermark_fast mm/page_alloc.c:3216 [inline]
         get_page_from_freelist+0xc49/0x44c0 mm/page_alloc.c:3300
         __alloc_pages_nodemask+0x21e/0x640 mm/page_alloc.c:4370
         alloc_pages_current+0xcc/0x210 mm/mempolicy.c:2093
         alloc_pages include/linux/gfp.h:509 [inline]
         __get_free_pages+0x12/0x60 mm/page_alloc.c:4414
         dma_mem_alloc+0x36/0x50 arch/x86/include/asm/floppy.h:156
         raw_cmd_copyin drivers/block/floppy.c:3159 [inline]
         raw_cmd_ioctl drivers/block/floppy.c:3206 [inline]
         fd_locked_ioctl+0xa00/0x2c10 drivers/block/floppy.c:3544
         fd_ioctl+0x40/0x60 drivers/block/floppy.c:3571
         __blkdev_driver_ioctl block/ioctl.c:303 [inline]
         blkdev_ioctl+0xb3c/0x1a30 block/ioctl.c:601
         block_ioctl+0x105/0x150 fs/block_dev.c:1883
         vfs_ioctl fs/ioctl.c:46 [inline]
         do_vfs_ioctl+0x1c0/0x1150 fs/ioctl.c:687
         ksys_ioctl+0x9e/0xb0 fs/ioctl.c:702
         __do_sys_ioctl fs/ioctl.c:709 [inline]
         __se_sys_ioctl fs/ioctl.c:707 [inline]
         __x64_sys_ioctl+0x7e/0xc0 fs/ioctl.c:707
         do_syscall_64+0xc4/0x510 arch/x86/entry/common.c:290
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Note that this is not a kvmalloc path.  It is just that the fast path
      really depends on having a sanitized order as well.  Therefore move the
      order check to the fast path.
      
      Link: http://lkml.kernel.org/r/20181113094305.GM15120@dhcp22.suse.cz
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Reported-by: Kyungtae Kim <kt0755@gmail.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Byoungyoung Lee <lifeasageek@gmail.com>
      Cc: "Dae R. Jeong" <threeearcat@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c63ae43b
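The effect of moving the order check into the fast path, as the page_alloc commit above describes, can be illustrated with a small model. This is a minimal Python sketch, not the kernel code: MAX_ORDER = 11 is an assumed value (the real limit is configuration-dependent), and pages_for_order(), alloc_unchecked() and alloc_checked() are hypothetical names. The model raises an exception where C would shift a 32-bit int by an out-of-range exponent, which is the condition UBSAN reported.

```python
MAX_ORDER = 11  # assumption: illustrative limit; the real value is config-dependent
INT_BITS = 32   # width of the C 'int' named in the UBSAN report


def pages_for_order(order: int) -> int:
    """Model of the C expression (1 << order) evaluated in a 32-bit int."""
    if not 0 <= order < INT_BITS:
        # In C, an exponent at or beyond the type width is undefined behaviour.
        raise OverflowError(
            f"shift exponent {order} is too large for {INT_BITS}-bit type 'int'"
        )
    return 1 << order


def alloc_unchecked(order: int) -> int:
    """Fast path without an order check: reaches the shift directly."""
    return pages_for_order(order)


def alloc_checked(order: int):
    """Fast path with the order check placed before any other arithmetic."""
    if order >= MAX_ORDER:
        return None  # reject the request up front
    return pages_for_order(order)
```

With the check in front, a request such as order 51 is rejected before any shift is evaluated; without it, the same request reaches the shift.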
    • scripts/spdxcheck.py: make python3 compliant · 6f4d29df
      Uwe Kleine-König authored
      Without this change the following happens when using Python3 (3.6.6):
      
      	$ echo "GPL-2.0" | python3 scripts/spdxcheck.py -
      	FAIL: 'str' object has no attribute 'decode'
      	Traceback (most recent call last):
      	  File "scripts/spdxcheck.py", line 253, in <module>
      	    parser.parse_lines(sys.stdin, args.maxlines, '-')
      	  File "scripts/spdxcheck.py", line 171, in parse_lines
      	    line = line.decode(locale.getpreferredencoding(False), errors='ignore')
      	AttributeError: 'str' object has no attribute 'decode'
      
      So as the line is already a string, there is no need to decode it and
      the line can be dropped.
      
      /usr/bin/python on Arch is Python 3.  So this would indeed be worth
      going into 4.19.
      
      Link: http://lkml.kernel.org/r/20181023070802.22558-1-u.kleine-koenig@pengutronix.de
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Joe Perches <joe@perches.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6f4d29df
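The traceback in the spdxcheck.py commit above comes from calling decode() on a str: in Python 3 only bytes objects have that method, and text-mode streams such as sys.stdin already yield str. The commit itself simply drops the decode line. The sketch below shows a different, hypothetical approach for code that must accept both types; to_text() is an illustrative helper, not part of scripts/spdxcheck.py.

```python
import locale


def to_text(line):
    """Return line as str, decoding only when it is actually bytes."""
    if isinstance(line, bytes):
        encoding = locale.getpreferredencoding(False)
        return line.decode(encoding, errors="ignore")
    return line  # already str, as with Python 3 text-mode input
```

A caller could pass each input line through to_text() before parsing, which behaves the same whether the stream was opened in text or binary mode.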
    • tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset · 1a413646
      Yufen Yu authored
      Other filesystems such as ext4, f2fs and ubifs all return ENXIO when
      lseek (SEEK_DATA or SEEK_HOLE) requests a negative offset.
      
      man 2 lseek says
      
      :      EINVAL whence  is  not  valid.   Or: the resulting file offset would be
      :             negative, or beyond the end of a seekable device.
      :
      :      ENXIO  whence is SEEK_DATA or SEEK_HOLE, and the file offset is  beyond
      :             the end of the file.
      
      Make tmpfs return ENXIO under these circumstances as well.  After this,
      tmpfs also passes xfstests's generic/448.
      
      [akpm@linux-foundation.org: rewrite changelog]
      Link: http://lkml.kernel.org/r/1540434176-14349-1-git-send-email-yuyufen@huawei.com
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: William Kucharski <william.kucharski@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1a413646
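The errno selection described in the tmpfs commit above can be written out as a small decision function. This is a minimal Python model, not the tmpfs implementation. It rests on two assumptions: that tmpfs reported EINVAL for a negative offset before the change (as the quoted man page text suggests), and that an offset equal to the file size counts as beyond the end. seek_data_hole_error() and its fixed parameter are hypothetical names.

```python
import errno


def seek_data_hole_error(offset: int, file_size: int, fixed: bool = True):
    """Return the errno a SEEK_DATA/SEEK_HOLE request fails with, or None."""
    if offset < 0:
        # Assumed pre-change tmpfs result: EINVAL. With the change, tmpfs
        # returns ENXIO, as the commit says ext4, f2fs and ubifs already do.
        return errno.ENXIO if fixed else errno.EINVAL
    if offset >= file_size:
        return errno.ENXIO  # assumed: at or beyond end of file
    return None  # the seek itself can proceed
```

For example, seek_data_hole_error(-1, 100) models the post-change result for a negative offset on a 100-byte file.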
    • lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn · 1c23b410
      Arnd Bergmann authored
      gcc-8 complains about the prototype for this function:
      
        lib/ubsan.c:432:1: error: ignoring attribute 'noreturn' in declaration of a built-in function '__ubsan_handle_builtin_unreachable' because it conflicts with attribute 'const' [-Werror=attributes]
      
      This is actually a GCC bug. In GCC internals,
      __ubsan_handle_builtin_unreachable() is declared with both 'noreturn' and
      'const' attributes instead of only 'noreturn':

         https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84210

      Work around this by removing the noreturn attribute.
      
      [aryabinin: add information about GCC bug in changelog]
      Link: http://lkml.kernel.org/r/20181107144516.4587-1-aryabinin@virtuozzo.com
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Acked-by: Olof Johansson <olof@lixom.net>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1c23b410
    • mm/vmstat.c: fix NUMA statistics updates · 13c9aaf7
      Janne Huttunen authored
      Scan through the whole array to see if an update is needed.  While we're
      at it, use sizeof() to be safe against any possible type changes in the
      future.
      
      The bug here is that we wouldn't sync per-cpu counters into global ones
      if there was an update of numa_stats for higher cpus.  This is highly
      theoretical though, because it is much more probable that zone_stats
      are updated so we would refresh anyway.  So I wouldn't bother to mark
      this for stable, yet it is something nice to fix.
      
      [mhocko@suse.com: changelog enhancement]
      Link: http://lkml.kernel.org/r/1541601517-17282-1-git-send-email-janne.huttunen@nokia.com
      Fixes: 1d90ca89 ("mm: update NUMA counter threshold size")
      Signed-off-by: Janne Huttunen <janne.huttunen@nokia.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      13c9aaf7
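The "scan through the whole array" and "use sizeof()" remarks in the vmstat commit above suggest a length mix-up between element count and byte count. The Python sketch below models that reading. It assumes 16-bit counters and that the old check passed the number of elements where a byte length was expected; neither detail is stated in the commit message. NR_ITEMS and both function names are illustrative.

```python
from array import array

NR_ITEMS = 6  # assumption: number of counters, chosen only for illustration


def needs_update_buggy(diff: array) -> bool:
    """Scan only NR_ITEMS bytes, treating the element count as a byte count."""
    raw = diff.tobytes()
    return any(raw[:NR_ITEMS])


def needs_update_fixed(diff: array) -> bool:
    """Scan NR_ITEMS * itemsize bytes, the sizeof()-style length."""
    raw = diff.tobytes()
    return any(raw[:NR_ITEMS * diff.itemsize])


# 'H' is an unsigned 16-bit element on common platforms.
diff = array("H", [0] * NR_ITEMS)
diff[NR_ITEMS - 1] = 1  # only the last counter has a pending delta
```

With 2-byte elements the buggy scan covers only the first half of the array, so a pending delta in the tail goes unnoticed, while the sizeof()-style scan sees it.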
    • mm/gup.c: fix follow_page_mask() kerneldoc comment · 78179556
      Mike Rapoport authored
      Commit df06b37f ("mm/gup: cache dev_pagemap while pinning pages")
      modified the signature of follow_page_mask() but left the parameter
      description behind.
      
      Update the description to make the code and comments agree again.
      
      While at it, update formatting of the return value description to match
      Documentation/doc-guide/kernel-doc.rst guidelines.
      
      Link: http://lkml.kernel.org/r/1541603316-27832-1-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      78179556
    • ocfs2: free up write context when direct IO failed · 5040f8df
      Wengang Wang authored
      The write context should also be freed even when direct IO failed.
      Otherwise a memory leak is introduced and entries remain in
      oi->ip_unwritten_list causing the following BUG later in unlink path:
      
        ERROR: bug expression: !list_empty(&oi->ip_unwritten_list)
        ERROR: Clear inode of 215043, inode has unwritten extents
        ...
        Call Trace:
        ? __set_current_blocked+0x42/0x68
        ocfs2_evict_inode+0x91/0x6a0 [ocfs2]
        ? bit_waitqueue+0x40/0x33
        evict+0xdb/0x1af
        iput+0x1a2/0x1f7
        do_unlinkat+0x194/0x28f
        SyS_unlinkat+0x1b/0x2f
        do_syscall_64+0x79/0x1ae
        entry_SYSCALL_64_after_hwframe+0x151/0x0
      
      This patch also logs, with frequency limit, direct IO failures.
      
      Link: http://lkml.kernel.org/r/20181102170632.25921-1-wen.gang.wang@oracle.com
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
      Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5040f8df
    • Roman Gushchin's avatar
      mm: don't reclaim inodes with many attached pages · a76cf1a4
      Roman Gushchin authored
      Spock reported that commit 172b06c3 ("mm: slowly shrink slabs with a
      relatively small number of objects") leads to a regression on his setup:
      periodically the majority of the pagecache is evicted without an obvious
      reason, while before the change the amount of free memory was balancing
      around the watermark.
      
      The reason is that the above-mentioned change created some minimal
      background pressure on the inode cache.  The problem is that when an
      inode is selected for reclaim, all of its pagecache pages are stripped,
      no matter how many there are.  So, if a huge multi-gigabyte file is
      cached in memory and the goal is to reclaim only a few slab objects
      (unused inodes), we can still end up evicting gigabytes of pagecache
      at once.
      
      The workload described by Spock has a few large non-mapped files in the
      pagecache, so the effect is especially noticeable.
      
      To solve the problem, postpone the reclaim of inodes that have more
      than 1 attached page.  Wait until the pagecache pages are evicted
      naturally by scanning the corresponding LRU lists, and only then
      reclaim the inode structure.
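      
      The decision described above can be sketched in user-space C.  This
      is only a toy model: struct toy_inode and the toy_* helpers are
      hypothetical names, not the kernel's inode LRU code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the inode fields that matter here. */
struct toy_inode {
    unsigned long nrpages;  /* pagecache pages still attached */
    bool in_use;            /* still referenced, never reclaimable */
};

/* Return true if the shrinker may free this inode right now. */
static bool toy_can_reclaim(const struct toy_inode *inode)
{
    if (inode->in_use)
        return false;
    /* Postpone: with more than 1 attached page, page reclaim goes first. */
    if (inode->nrpages > 1)
        return false;
    return true;
}

/* Count how many inodes in an array would be reclaimed in one pass. */
static size_t toy_count_reclaimable(const struct toy_inode *inodes, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (toy_can_reclaim(&inodes[i]))
            count++;
    return count;
}
```

      In this model an inode caching a multi-gigabyte file is skipped
      until page reclaim has shrunk it to at most one page, so a request
      for a few slab objects cannot strip the whole file.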
      
      Link: http://lkml.kernel.org/r/20181023164302.20436-1-guro@fb.com
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Reported-by: Spock <dairinin@gmail.com>
      Tested-by: Spock <dairinin@gmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: <stable@vger.kernel.org>	[4.19.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a76cf1a4
    • Michal Hocko's avatar
      mm, memory_hotplug: check zone_movable in has_unmovable_pages · 9d789999
      Michal Hocko authored
      Page state checks are racy.  Under a heavy memory workload (e.g.  stress
      -m 200 -t 2h) it is quite easy to hit a race window when the page is
      allocated but its state is not fully populated yet.  A debugging patch to
      dump the struct page state shows
      
        has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
        page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
        flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)
      
      Note that the state has been checked for both PageLRU and PageSwapBacked
      already.  Closing this race completely would require some sort of retry
      logic.  This can be tricky and error prone (think of potential endless
      or long taking loops).
      
      Work around this problem for movable zones at least.  Such a zone should
      only contain movable pages.  Commit 15c30bc0 ("mm, memory_hotplug:
      make has_unmovable_pages more robust") has told us that this is not
      strictly true.  Bootmem pages should be marked reserved, however, so
      we can move the original check after the PageReserved check.  Pages from
      other zones are still prone to races, but we do not even pretend that
      memory hotremove works for those, so a premature failure doesn't hurt
      that much.
      
      Link: http://lkml.kernel.org/r/20181106095524.14629-1-mhocko@kernel.org
      Fixes: 15c30bc0 ("mm, memory_hotplug: make has_unmovable_pages more robust")
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Baoquan He <bhe@redhat.com>
      Tested-by: Baoquan He <bhe@redhat.com>
      Acked-by: Baoquan He <bhe@redhat.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9d789999
    • Vasily Averin's avatar
      mm/swapfile.c: use kvzalloc for swap_info_struct allocation · 873d7bcf
      Vasily Averin authored
      Commit a2468cc9 ("swap: choose swap device according to numa node")
      changed the 'avail_lists' field of 'struct swap_info_struct' to an array.
      In popular Linux distros this increased the size of swap_info_struct to
      as much as 40 Kbytes, so its allocation now requires an order-4 page.
      Switching to kvzalloc avoids unexpected allocation failures.
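      
      The order-4 figure can be checked with a small calculation.  The
      sketch assumes 4 KiB pages; toy_get_order() is a hypothetical helper
      that mirrors the idea of the kernel's get_order(), not its code.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages */

/* Smallest order such that (TOY_PAGE_SIZE << order) >= size. */
static unsigned int toy_get_order(size_t size)
{
    unsigned int order = 0;

    while ((TOY_PAGE_SIZE << order) < size)
        order++;
    return order;
}
```

      With these assumptions 40 Kbytes is 10 pages, and the next power of
      two is 16 pages, i.e. order 4.  A physically contiguous allocation
      of that size can fail on a fragmented system, which is why a helper
      that can fall back to a virtually contiguous mapping is preferable.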
      
      Link: http://lkml.kernel.org/r/fc23172d-3c75-21e2-d551-8b1808cbe593@virtuozzo.com
      Fixes: a2468cc9 ("swap: choose swap device according to numa node")
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Acked-by: Aaron Lu <aaron.lu@intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      873d7bcf
    • Aaro Koskinen's avatar
      MAINTAINERS: update OMAP MMC entry · f341e16f
      Aaro Koskinen authored
      Jarkko's e-mail address hasn't worked for a long time.  We still want to
      keep this driver working as it is critical for some of the OMAP boards.
      I use and test this driver frequently, so list myself as the maintainer
      with "Odd Fixes" status.
      
      Link: http://lkml.kernel.org/r/20181106222750.12939-1-aaro.koskinen@iki.fi
      Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Acked-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f341e16f
    • Mike Kravetz's avatar
      hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444! · 5e41540c
      Mike Kravetz authored
      This bug has been experienced several times by the Oracle DB team.  The
      BUG is in remove_inode_hugepages() as follows:
      
      	/*
      	 * If page is mapped, it was faulted in after being
      	 * unmapped in caller.  Unmap (again) now after taking
      	 * the fault mutex.  The mutex will prevent faults
      	 * until we finish removing the page.
      	 *
      	 * This race can only happen in the hole punch case.
      	 * Getting here in a truncate operation is a bug.
      	 */
      	if (unlikely(page_mapped(page))) {
      		BUG_ON(truncate_op);
      
      In this case, the elevated map count is not the result of a race.
      Rather it was incorrectly incremented as the result of a bug in the huge
      pmd sharing code.  Consider the following:
      
       - Process A maps a hugetlbfs file of sufficient size and alignment
         (PUD_SIZE) that a pmd page could be shared.
      
       - Process B maps the same hugetlbfs file with the same size and
         alignment such that a pmd page is shared.
      
       - Process B then calls mprotect() to change protections for the mapping
         with the shared pmd. As a result, the pmd is 'unshared'.
      
       - Process B then calls mprotect() again to change protections for the
         mapping back to their original value. pmd remains unshared.
      
       - Process B then forks and process C is created. During the fork
         process, we do dup_mm -> dup_mmap -> copy_page_range to copy page
         tables. Copying page tables for hugetlb mappings is done in the
         routine copy_hugetlb_page_range.
      
      In copy_hugetlb_page_range(), the destination pte is obtained by:
      
      	dst_pte = huge_pte_alloc(dst, addr, sz);
      
      If pmd sharing is possible, the returned pointer will be to a pte in an
      existing page table.  In the situation above, process C could share with
      either process A or process B.  Since process A is first in the list,
      the returned pte is a pointer to a pte in process A's page table.
      
      However, the check for pmd sharing in copy_hugetlb_page_range is:
      
      	/* If the pagetables are shared don't copy or take references */
      	if (dst_pte == src_pte)
      		continue;
      
      Since process C is sharing with process A instead of process B, the
      above test fails.  The code in copy_hugetlb_page_range which follows
      assumes dst_pte points to a huge_pte_none pte.  It copies the pte entry
      from src_pte to dst_pte and increments the map count of the associated
      page.  This is how we end up with an elevated map count.
      
      To solve, check the dst_pte entry for huge_pte_none.  If !none, this
      implies PMD sharing so do not copy.
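      
      The difference between the two checks can be modelled in user-space
      C.  This is a sketch only: toy_pte_t, struct toy_page and the toy_*
      functions are hypothetical, and a value of 0 stands in for a
      huge_pte_none entry.

```c
#include <stdbool.h>

typedef unsigned long toy_pte_t;  /* 0 stands for an empty (none) entry */

struct toy_page {
    int mapcount;
};

/* Old check: skip only when source and destination are the same entry. */
static bool toy_should_skip_old(const toy_pte_t *dst, const toy_pte_t *src)
{
    return dst == src;
}

/* Fixed check: also skip when dst is already populated (shared PMD). */
static bool toy_should_skip_new(const toy_pte_t *dst, const toy_pte_t *src)
{
    return dst == src || *dst != 0;
}

/* Copy one entry at fork time; returns true if mapcount was incremented. */
static bool toy_copy_entry(toy_pte_t *dst, const toy_pte_t *src,
                           struct toy_page *page, bool fixed)
{
    bool skip = fixed ? toy_should_skip_new(dst, src)
                      : toy_should_skip_old(dst, src);

    if (skip)
        return false;
    *dst = *src;
    page->mapcount++;
    return true;
}
```

      When dst points into process A's table and src into process B's,
      the pointers differ although dst is already populated.  The old
      check then copies and bumps the map count a second time, while the
      fixed check skips the entry.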
      
      Link: http://lkml.kernel.org/r/20181105212315.14125-1-mike.kravetz@oracle.com
      Fixes: c5c99429 ("fix hugepages leak due to pagetable page sharing")
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Prakash Sangappa <prakash.sangappa@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5e41540c
    • Olof Johansson's avatar
      kernel/sched/psi.c: simplify cgroup_move_task() · 8fcb2312
      Olof Johansson authored
      The existing code triggered an invalid warning about 'rq' possibly being
      used uninitialized.  Instead of doing the silly warning suppression by
      initializing it to NULL, refactor the code to bail out early instead.
      
      Warning was:
      
        kernel/sched/psi.c: In function `cgroup_move_task':
        kernel/sched/psi.c:639:13: warning: `rq' may be used uninitialized in this function [-Wmaybe-uninitialized]
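      
      The shape of such a refactoring can be sketched in user-space C.
      The names below (toy_rq, toy_psi_disabled, toy_move_task) are
      hypothetical and only illustrate the pattern; this is not the code
      in kernel/sched/psi.c.

```c
#include <stdbool.h>

struct toy_rq {
    int nr_moves;  /* number of accounted task moves */
};

static struct toy_rq toy_runqueue;
static bool toy_psi_disabled;

/*
 * Move a task to a new group.  Returns 1 if the move was accounted on the
 * runqueue, 0 if accounting is disabled and only the pointer was updated.
 */
static int toy_move_task(int *task_group, int to)
{
    struct toy_rq *rq;

    if (toy_psi_disabled) {
        /* Bail out early: rq is never touched on this path. */
        *task_group = to;
        return 0;
    }

    /* From here on rq is always assigned before it is used. */
    rq = &toy_runqueue;
    *task_group = to;
    rq->nr_moves++;
    return 1;
}
```

      Because the disabled case returns before rq is needed, every use of
      rq on the remaining path is preceded by its assignment, and no
      dummy NULL initialization is required.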
      
      Link: http://lkml.kernel.org/r/20181103183339.8669-1-olof@lixom.net
      Fixes: 2ce7135a ("psi: cgroup support")
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8fcb2312
    • Vitaly Wool's avatar
      z3fold: fix possible reclaim races · ca0246bb
      Vitaly Wool authored
      Reclaim and free can race on an object, which is basically fine, but
      for reclaim to be able to map a "freed" object we need to encode the
      object length in the handle.  handle_to_chunks() is then introduced to
      extract the object length from a handle and use it during mapping.
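      
      Packing a length into the low bits of an aligned handle can be
      sketched as follows.  The bit layout and all toy_* names are
      assumptions made for illustration; they are not z3fold's actual
      handle format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: [ aligned base | 6 bits of chunks | 2 bits of buddy ] */
#define TOY_BUDDY_BITS 2u
#define TOY_BUDDY_MASK ((1u << TOY_BUDDY_BITS) - 1)
#define TOY_CHUNK_BITS 6u
#define TOY_CHUNK_MASK ((1u << TOY_CHUNK_BITS) - 1)
#define TOY_LOW_MASK   ((1u << (TOY_BUDDY_BITS + TOY_CHUNK_BITS)) - 1)

/* Combine an aligned base address, a buddy id and a length in chunks. */
static uintptr_t toy_encode_handle(uintptr_t base, unsigned int buddy,
                                   unsigned int chunks)
{
    assert((base & TOY_LOW_MASK) == 0);  /* low bits must be free */
    assert(buddy <= TOY_BUDDY_MASK && chunks <= TOY_CHUNK_MASK);
    return base | ((uintptr_t)chunks << TOY_BUDDY_BITS) | buddy;
}

/* Recover the buddy id from a handle. */
static unsigned int toy_handle_to_buddy(uintptr_t handle)
{
    return (unsigned int)(handle & TOY_BUDDY_MASK);
}

/* Recover the object length (in chunks) from a handle. */
static unsigned int toy_handle_to_chunks(uintptr_t handle)
{
    return (unsigned int)((handle >> TOY_BUDDY_BITS) & TOY_CHUNK_MASK);
}

/* Recover the aligned base address from a handle. */
static uintptr_t toy_handle_to_base(uintptr_t handle)
{
    return handle & ~(uintptr_t)TOY_LOW_MASK;
}
```

      Since the length travels with the handle, a reader of the handle
      does not have to consult the page header, which may already have
      been updated by a concurrent free.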
      
      Moreover, to avoid racing on a z3fold "headless" page release, we should
      not try to free that page in z3fold_free() if the reclaim bit is set.
      Also, in the unlikely case of trying to reclaim a page being freed, we
      should not proceed with that page.
      
      While at it, fix the page accounting in reclaim function.
      
      This patch supersedes "[PATCH] z3fold: fix reclaim lock-ups".
      
      Link: http://lkml.kernel.org/r/20181105162225.74e8837d03583a9b707cf559@gmail.com
      Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
      Signed-off-by: Jongseok Kim <ks77sj@gmail.com>
      Reported-by: Jongseok Kim <ks77sj@gmail.com>
      Reviewed-by: Snild Dolkow <snild@sony.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ca0246bb
    • Jon Maloy's avatar
      tipc: don't assume linear buffer when reading ancillary data · 1c1274a5
      Jon Maloy authored
      The code for reading ancillary data from a received buffer is assuming
      the buffer is linear. To make this assumption true we have to linearize
      the buffer before message data is read.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1c1274a5
    • Jon Maloy's avatar
      tipc: fix lockdep warning when reinitilaizing sockets · adba75be
      Jon Maloy authored
      We get the following warning:
      
      [   47.926140] 32-bit node address hash set to 2010a0a
      [   47.927202]
      [   47.927433] ================================
      [   47.928050] WARNING: inconsistent lock state
      [   47.928661] 4.19.0+ #37 Tainted: G            E
      [   47.929346] --------------------------------
      [   47.929954] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [   47.930116] swapper/3/0 [HC0[0]:SC1[3]:HE1:SE0] takes:
      [   47.930116] 00000000af8bc31e (&(&ht->lock)->rlock){+.?.}, at: rhashtable_walk_enter+0x36/0xb0
      [   47.930116] {SOFTIRQ-ON-W} state was registered at:
      [   47.930116]   _raw_spin_lock+0x29/0x60
      [   47.930116]   rht_deferred_worker+0x556/0x810
      [   47.930116]   process_one_work+0x1f5/0x540
      [   47.930116]   worker_thread+0x64/0x3e0
      [   47.930116]   kthread+0x112/0x150
      [   47.930116]   ret_from_fork+0x3a/0x50
      [   47.930116] irq event stamp: 14044
      [   47.930116] hardirqs last  enabled at (14044): [<ffffffff9a07fbba>] __local_bh_enable_ip+0x7a/0xf0
      [   47.938117] hardirqs last disabled at (14043): [<ffffffff9a07fb81>] __local_bh_enable_ip+0x41/0xf0
      [   47.938117] softirqs last  enabled at (14028): [<ffffffff9a0803ee>] irq_enter+0x5e/0x60
      [   47.938117] softirqs last disabled at (14029): [<ffffffff9a0804a5>] irq_exit+0xb5/0xc0
      [   47.938117]
      [   47.938117] other info that might help us debug this:
      [   47.938117]  Possible unsafe locking scenario:
      [   47.938117]
      [   47.938117]        CPU0
      [   47.938117]        ----
      [   47.938117]   lock(&(&ht->lock)->rlock);
      [   47.938117]   <Interrupt>
      [   47.938117]     lock(&(&ht->lock)->rlock);
      [   47.938117]
      [   47.938117]  *** DEADLOCK ***
      [   47.938117]
      [   47.938117] 2 locks held by swapper/3/0:
      [   47.938117]  #0: 0000000062c64f90 ((&d->timer)){+.-.}, at: call_timer_fn+0x5/0x280
      [   47.938117]  #1: 00000000ee39619c (&(&d->lock)->rlock){+.-.}, at: tipc_disc_timeout+0xc8/0x540 [tipc]
      [   47.938117]
      [   47.938117] stack backtrace:
      [   47.938117] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E     4.19.0+ #37
      [   47.938117] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [   47.938117] Call Trace:
      [   47.938117]  <IRQ>
      [   47.938117]  dump_stack+0x5e/0x8b
      [   47.938117]  print_usage_bug+0x1ed/0x1ff
      [   47.938117]  mark_lock+0x5b5/0x630
      [   47.938117]  __lock_acquire+0x4c0/0x18f0
      [   47.938117]  ? lock_acquire+0xa6/0x180
      [   47.938117]  lock_acquire+0xa6/0x180
      [   47.938117]  ? rhashtable_walk_enter+0x36/0xb0
      [   47.938117]  _raw_spin_lock+0x29/0x60
      [   47.938117]  ? rhashtable_walk_enter+0x36/0xb0
      [   47.938117]  rhashtable_walk_enter+0x36/0xb0
      [   47.938117]  tipc_sk_reinit+0xb0/0x410 [tipc]
      [   47.938117]  ? mark_held_locks+0x6f/0x90
      [   47.938117]  ? __local_bh_enable_ip+0x7a/0xf0
      [   47.938117]  ? lockdep_hardirqs_on+0x20/0x1a0
      [   47.938117]  tipc_net_finalize+0xbf/0x180 [tipc]
      [   47.938117]  tipc_disc_timeout+0x509/0x540 [tipc]
      [   47.938117]  ? call_timer_fn+0x5/0x280
      [   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
      [   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
      [   47.938117]  call_timer_fn+0xa1/0x280
      [   47.938117]  ? tipc_disc_msg_xmit.isra.19+0xa0/0xa0 [tipc]
      [   47.938117]  run_timer_softirq+0x1f2/0x4d0
      [   47.938117]  __do_softirq+0xfc/0x413
      [   47.938117]  irq_exit+0xb5/0xc0
      [   47.938117]  smp_apic_timer_interrupt+0xac/0x210
      [   47.938117]  apic_timer_interrupt+0xf/0x20
      [   47.938117]  </IRQ>
      [   47.938117] RIP: 0010:default_idle+0x1c/0x140
      [   47.938117] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 54 55 53 65 8b 2d d8 2b 74 65 0f 1f 44 00 00 e8 c6 2c 8b ff fb f4 <65> 8b 2d c5 2b 74 65 0f 1f 44 00 00 5b 5d 41 5c c3 65 8b 05 b4 2b
      [   47.938117] RSP: 0018:ffffaf6ac0207ec8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
      [   47.938117] RAX: ffff8f5b3735e200 RBX: 0000000000000003 RCX: 0000000000000001
      [   47.938117] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8f5b3735e200
      [   47.938117] RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000000
      [   47.938117] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      [   47.938117] R13: 0000000000000000 R14: ffff8f5b3735e200 R15: ffff8f5b3735e200
      [   47.938117]  ? default_idle+0x1a/0x140
      [   47.938117]  do_idle+0x1bc/0x280
      [   47.938117]  cpu_startup_entry+0x19/0x20
      [   47.938117]  start_secondary+0x187/0x1c0
      [   47.938117]  secondary_startup_64+0xa4/0xb0
      
      The reason seems to be that tipc_net_finalize()->tipc_sk_reinit() is
      calling the function rhashtable_walk_enter() within a timer interrupt.
      We fix this by executing tipc_net_finalize() in work queue context.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      adba75be
    • Eric Dumazet's avatar
      net-gro: reset skb->pkt_type in napi_reuse_skb() · 33d9a2c7
      Eric Dumazet authored
      eth_type_trans() assumes initial value for skb->pkt_type
      is PACKET_HOST.
      
      This is indeed the value right after a fresh skb allocation.
      
      However, it is possible that GRO merged a packet with a different
      value (like PACKET_OTHERHOST in case macvlan is used), so
      we need to make sure napi->skb will have pkt_type set back to
      PACKET_HOST.
      
      Otherwise, valid packets might be dropped by the stack because
      their pkt_type is not PACKET_HOST.
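      
      The stale-value problem can be modelled in user-space C.  struct
      toy_skb and the toy_* functions are hypothetical stand-ins, and
      toy_type_trans() models only the assumption described above: it
      leaves pkt_type untouched for frames addressed to this host.

```c
#include <stdbool.h>

enum toy_pkt_type {
    TOY_PACKET_HOST = 0,
    TOY_PACKET_OTHERHOST = 3
};

struct toy_skb {
    enum toy_pkt_type pkt_type;
    unsigned int len;
};

/* Recycle a buffer; optionally restore the freshly-allocated pkt_type. */
static void toy_reuse_skb(struct toy_skb *skb, bool reset_pkt_type)
{
    skb->len = 0;
    if (reset_pkt_type)
        skb->pkt_type = TOY_PACKET_HOST;
}

/* Classify a frame, assuming pkt_type already starts as TOY_PACKET_HOST. */
static void toy_type_trans(struct toy_skb *skb, bool addressed_to_us)
{
    if (!addressed_to_us)
        skb->pkt_type = TOY_PACKET_OTHERHOST;
}

/* The stack only accepts packets classified as being for this host. */
static bool toy_stack_accepts(const struct toy_skb *skb)
{
    return skb->pkt_type == TOY_PACKET_HOST;
}
```

      A buffer recycled without the reset keeps the value left by the
      previous packet, so a later frame addressed to this host is still
      classified as foreign and rejected.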
      
      napi_reuse_skb() was added in commit 96e93eab ("gro: Add
      internal interfaces for VLAN"), but this bug has always
      been there.
      
      Fixes: 96e93eab ("gro: Add internal interfaces for VLAN")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      33d9a2c7
    • David S. Miller's avatar
      Merge branch 'tdc-fixes' · 5396527f
      David S. Miller authored
      Lucas Bates says:
      
      ====================
      Prevent uncaught exceptions in tdc
      
      This patch series addresses two potential bugs in tdc that can
      cause exceptions to be raised in certain circumstances.  These
      exceptions are generally not handled, so instead we will prevent
      them from being raised.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5396527f
    • Brenda J. Butler's avatar
      tc-testing: tdc.py: Guard against lack of returncode in executed command · c6cecf4a
      Brenda J. Butler authored
      Add some defensive coding in case one of the subprocesses created by tdc
      returns nothing. If no object is returned from exec_cmd, then tdc will
      halt with an unhandled exception.
      Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
      Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c6cecf4a