1. 27 Oct, 2017 1 commit
    • Hans de Goede's avatar
      USB: devio: Revert "USB: devio: Don't corrupt user memory" · abe43c97
      Hans de Goede authored
      commit 845d584f upstream.
      
      Taking the uurb->buffer_length userspace passes in as a maximum for the
      actual urbs transfer_buffer_length causes 2 serious issues:
      
      1) It breaks isochronous support for all userspace apps using libusb,
         as existing libusb versions pass in 0 for uurb->buffer_length,
         relying on the kernel using the lenghts of the usbdevfs_iso_packet_desc
         descriptors passed in added together as buffer length.
      
         This for example causes redirection of USB audio and Webcam's into
         virtual machines using qemu-kvm to no longer work. This is a userspace
         ABI break and as such must be reverted.
      
         Note that the original commit does not protect other users / the
         kernels memory, it only stops the userspace process making the call
         from shooting itself in the foot.
      
      2) It may cause the kernel to program host controllers to DMA over random
         memory. Just as the devio code used to only look at the iso_packet_desc
         lenghts, the host drivers do the same, relying on the submitter of the
         urbs to make sure the entire buffer is large enough and not checking
         transfer_buffer_length.
      
         But the "USB: devio: Don't corrupt user memory" commit now takes the
         userspace provided uurb->buffer_length for the buffer-size while copying
         over the user-provided iso_packet_desc lengths 1:1, allowing the user
         to specify a small buffer size while programming the host controller to
         dma a lot more data.
      
         (Atleast the ohci, uhci, xhci and fhci drivers do not check
          transfer_buffer_length for isoc transfers.)
      
      This reverts commit fa1ed74e ("USB: devio: Don't corrupt user memory")
      fixing both these issues.
      
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Acked-by: default avatarAlan Stern <stern@rowland.harvard.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      abe43c97
  2. 21 Oct, 2017 39 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.4.94 · af9a9a7b
      Greg Kroah-Hartman authored
      af9a9a7b
    • Greg Kroah-Hartman's avatar
      Revert "tty: goldfish: Fix a parameter of a call to free_irq" · 401231d0
      Greg Kroah-Hartman authored
      This reverts commit 01b3db29 which is
      commit 1a5c2d1d upstream.
      
      Ben writes:
      	This fixes a bug introduced in 4.6 by commit 465893e1 "tty:
      	goldfish: support platform_device with id -1".  For earlier
      	kernel versions, it *introduces* a bug.
      
      So let's drop it.
      Reported-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
      401231d0
    • Arnd Bergmann's avatar
      cpufreq: CPPC: add ACPI_PROCESSOR dependency · cdbbea78
      Arnd Bergmann authored
      
      [ Upstream commit a578884f ]
      
      Without the Kconfig dependency, we can get this warning:
      
      warning: ACPI_CPPC_CPUFREQ selects ACPI_CPPC_LIB which has unmet direct dependencies (ACPI && ACPI_PROCESSOR)
      
      Fixes: 5477fb3b (ACPI / CPPC: Add a CPUFreq driver for use with CPPC)
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cdbbea78
    • Kinglong Mee's avatar
      nfsd/callback: Cleanup callback cred on shutdown · c2c6f43e
      Kinglong Mee authored
      
      [ Upstream commit f7d1ddbe ]
      
      The rpccred gotten from rpc_lookup_machine_cred() should be put when
      state is shutdown.
      Signed-off-by: default avatarKinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c2c6f43e
    • Varun Prakash's avatar
      target/iscsi: Fix unsolicited data seq_end_offset calculation · 429a4ac5
      Varun Prakash authored
      
      [ Upstream commit 4d65491c ]
      
      In case of unsolicited data for the first sequence
      seq_end_offset must be set to minimum of total data length
      and FirstBurstLength, so do not add cmd->write_data_done
      to the min of total data length and FirstBurstLength.
      
      This patch avoids that with ImmediateData=Yes, InitialR2T=No,
      MaxXmitDataSegmentLength < FirstBurstLength that a WRITE command
      with IO size above FirstBurstLength triggers sequence error
      messages, for example
      
      Set following parameters on target (linux-4.8.12)
      ImmediateData = Yes
      InitialR2T = No
      MaxXmitDataSegmentLength = 8k
      FirstBurstLength = 64k
      
      Log in from Open iSCSI initiator and execute
      dd if=/dev/zero of=/dev/sdb bs=128k count=1 oflag=direct
      
      Error messages on target
      Command ITT: 0x00000035 with Offset: 65536, Length: 8192 outside
      of Sequence 73728:131072 while DataSequenceInOrder=Yes.
      Command ITT: 0x00000035, received DataSN: 0x00000001 higher than
      expected 0x00000000.
      Unable to perform within-command recovery while ERL=0.
      Signed-off-by: default avatarVarun Prakash <varun@chelsio.com>
      [ bvanassche: Use min() instead of open-coding it / edited patch description ]
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      429a4ac5
    • Dmitry V. Levin's avatar
      uapi: fix linux/mroute6.h userspace compilation errors · 823ba64c
      Dmitry V. Levin authored
      
      [ Upstream commit 72aa107d ]
      
      Include <linux/in6.h> to fix the following linux/mroute6.h userspace
      compilation errors:
      
      /usr/include/linux/mroute6.h:80:22: error: field 'mf6cc_origin' has incomplete type
        struct sockaddr_in6 mf6cc_origin;  /* Origin of mcast */
      /usr/include/linux/mroute6.h:81:22: error: field 'mf6cc_mcastgrp' has incomplete type
        struct sockaddr_in6 mf6cc_mcastgrp;  /* Group in question */
      /usr/include/linux/mroute6.h:91:22: error: field 'src' has incomplete type
        struct sockaddr_in6 src;
      /usr/include/linux/mroute6.h:92:22: error: field 'grp' has incomplete type
        struct sockaddr_in6 grp;
      /usr/include/linux/mroute6.h:132:18: error: field 'im6_src' has incomplete type
        struct in6_addr im6_src, im6_dst;
      /usr/include/linux/mroute6.h:132:27: error: field 'im6_dst' has incomplete type
        struct in6_addr im6_src, im6_dst;
      Signed-off-by: default avatarDmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      823ba64c
    • Dmitry V. Levin's avatar
      uapi: fix linux/rds.h userspace compilation errors · 028a4198
      Dmitry V. Levin authored
      
      [ Upstream commit feb0869d ]
      
      Consistently use types from linux/types.h to fix the following
      linux/rds.h userspace compilation errors:
      
      /usr/include/linux/rds.h:106:2: error: unknown type name 'uint8_t'
        uint8_t name[32];
      /usr/include/linux/rds.h:107:2: error: unknown type name 'uint64_t'
        uint64_t value;
      /usr/include/linux/rds.h:117:2: error: unknown type name 'uint64_t'
        uint64_t next_tx_seq;
      /usr/include/linux/rds.h:118:2: error: unknown type name 'uint64_t'
        uint64_t next_rx_seq;
      /usr/include/linux/rds.h:121:2: error: unknown type name 'uint8_t'
        uint8_t transport[TRANSNAMSIZ];  /* null term ascii */
      /usr/include/linux/rds.h:122:2: error: unknown type name 'uint8_t'
        uint8_t flags;
      /usr/include/linux/rds.h:129:2: error: unknown type name 'uint64_t'
        uint64_t seq;
      /usr/include/linux/rds.h:130:2: error: unknown type name 'uint32_t'
        uint32_t len;
      /usr/include/linux/rds.h:135:2: error: unknown type name 'uint8_t'
        uint8_t flags;
      /usr/include/linux/rds.h:139:2: error: unknown type name 'uint32_t'
        uint32_t sndbuf;
      /usr/include/linux/rds.h:144:2: error: unknown type name 'uint32_t'
        uint32_t rcvbuf;
      /usr/include/linux/rds.h:145:2: error: unknown type name 'uint64_t'
        uint64_t inum;
      /usr/include/linux/rds.h:153:2: error: unknown type name 'uint64_t'
        uint64_t       hdr_rem;
      /usr/include/linux/rds.h:154:2: error: unknown type name 'uint64_t'
        uint64_t       data_rem;
      /usr/include/linux/rds.h:155:2: error: unknown type name 'uint32_t'
        uint32_t       last_sent_nxt;
      /usr/include/linux/rds.h:156:2: error: unknown type name 'uint32_t'
        uint32_t       last_expected_una;
      /usr/include/linux/rds.h:157:2: error: unknown type name 'uint32_t'
        uint32_t       last_seen_una;
      /usr/include/linux/rds.h:164:2: error: unknown type name 'uint8_t'
        uint8_t  src_gid[RDS_IB_GID_LEN];
      /usr/include/linux/rds.h:165:2: error: unknown type name 'uint8_t'
        uint8_t  dst_gid[RDS_IB_GID_LEN];
      /usr/include/linux/rds.h:167:2: error: unknown type name 'uint32_t'
        uint32_t max_send_wr;
      /usr/include/linux/rds.h:168:2: error: unknown type name 'uint32_t'
        uint32_t max_recv_wr;
      /usr/include/linux/rds.h:169:2: error: unknown type name 'uint32_t'
        uint32_t max_send_sge;
      /usr/include/linux/rds.h:170:2: error: unknown type name 'uint32_t'
        uint32_t rdma_mr_max;
      /usr/include/linux/rds.h:171:2: error: unknown type name 'uint32_t'
        uint32_t rdma_mr_size;
      /usr/include/linux/rds.h:212:9: error: unknown type name 'uint64_t'
       typedef uint64_t rds_rdma_cookie_t;
      /usr/include/linux/rds.h:215:2: error: unknown type name 'uint64_t'
        uint64_t addr;
      /usr/include/linux/rds.h:216:2: error: unknown type name 'uint64_t'
        uint64_t bytes;
      /usr/include/linux/rds.h:221:2: error: unknown type name 'uint64_t'
        uint64_t cookie_addr;
      /usr/include/linux/rds.h:222:2: error: unknown type name 'uint64_t'
        uint64_t flags;
      /usr/include/linux/rds.h:228:2: error: unknown type name 'uint64_t'
        uint64_t  cookie_addr;
      /usr/include/linux/rds.h:229:2: error: unknown type name 'uint64_t'
        uint64_t  flags;
      /usr/include/linux/rds.h:234:2: error: unknown type name 'uint64_t'
        uint64_t flags;
      /usr/include/linux/rds.h:240:2: error: unknown type name 'uint64_t'
        uint64_t local_vec_addr;
      /usr/include/linux/rds.h:241:2: error: unknown type name 'uint64_t'
        uint64_t nr_local;
      /usr/include/linux/rds.h:242:2: error: unknown type name 'uint64_t'
        uint64_t flags;
      /usr/include/linux/rds.h:243:2: error: unknown type name 'uint64_t'
        uint64_t user_token;
      /usr/include/linux/rds.h:248:2: error: unknown type name 'uint64_t'
        uint64_t  local_addr;
      /usr/include/linux/rds.h:249:2: error: unknown type name 'uint64_t'
        uint64_t  remote_addr;
      /usr/include/linux/rds.h:252:4: error: unknown type name 'uint64_t'
          uint64_t compare;
      /usr/include/linux/rds.h:253:4: error: unknown type name 'uint64_t'
          uint64_t swap;
      /usr/include/linux/rds.h:256:4: error: unknown type name 'uint64_t'
          uint64_t add;
      /usr/include/linux/rds.h:259:4: error: unknown type name 'uint64_t'
          uint64_t compare;
      /usr/include/linux/rds.h:260:4: error: unknown type name 'uint64_t'
          uint64_t swap;
      /usr/include/linux/rds.h:261:4: error: unknown type name 'uint64_t'
          uint64_t compare_mask;
      /usr/include/linux/rds.h:262:4: error: unknown type name 'uint64_t'
          uint64_t swap_mask;
      /usr/include/linux/rds.h:265:4: error: unknown type name 'uint64_t'
          uint64_t add;
      /usr/include/linux/rds.h:266:4: error: unknown type name 'uint64_t'
          uint64_t nocarry_mask;
      /usr/include/linux/rds.h:269:2: error: unknown type name 'uint64_t'
        uint64_t flags;
      /usr/include/linux/rds.h:270:2: error: unknown type name 'uint64_t'
        uint64_t user_token;
      /usr/include/linux/rds.h:274:2: error: unknown type name 'uint64_t'
        uint64_t user_token;
      /usr/include/linux/rds.h:275:2: error: unknown type name 'int32_t'
        int32_t  status;
      Signed-off-by: default avatarDmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      028a4198
    • Jeff Layton's avatar
      ceph: clean up unsafe d_parent accesses in build_dentry_path · c7a20ed2
      Jeff Layton authored
      
      [ Upstream commit c6b0b656 ]
      
      While we hold a reference to the dentry when build_dentry_path is
      called, we could end up racing with a rename that changes d_parent.
      Handle that situation correctly, by using the rcu_read_lock to
      ensure that the parent dentry and inode stick around long enough
      to safely check ceph_snap and ceph_ino.
      
      Link: http://tracker.ceph.com/issues/18148Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Reviewed-by: default avatarYan, Zheng <zyan@redhat.com>
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7a20ed2
    • Alexandre Belloni's avatar
      i2c: at91: ensure state is restored after suspending · c128baf6
      Alexandre Belloni authored
      
      [ Upstream commit e3ccc921 ]
      
      When going to suspend, the I2C registers may be lost because the power to
      VDDcore is cut. Restore them when resuming.
      Signed-off-by: default avatarAlexandre Belloni <alexandre.belloni@free-electrons.com>
      Acked-by: default avatarLudovic Desroches <ludovic.desroches@microchip.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c128baf6
    • Thomas Petazzoni's avatar
      net: mvpp2: release reference to txq_cpu[] entry after unmapping · d7ecae72
      Thomas Petazzoni authored
      
      [ Upstream commit 36fb7435 ]
      
      The mvpp2_txq_bufs_free() function is called upon TX completion to DMA
      unmap TX buffers, and free the corresponding SKBs. It gets the
      references to the SKB to free and the DMA buffer to unmap from a per-CPU
      txq_pcpu data structure.
      
      However, the code currently increments the pointer to the next entry
      before doing the DMA unmap and freeing the SKB. It does not cause any
      visible problem because for a given SKB the TX completion is guaranteed
      to take place on the CPU where the TX was started. However, it is much
      more logical to increment the pointer to the next entry once the current
      entry has been completely unmapped/released.
      Signed-off-by: default avatarThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Acked-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d7ecae72
    • Dan Carpenter's avatar
      scsi: scsi_dh_emc: return success in clariion_std_inquiry() · 693e6513
      Dan Carpenter authored
      
      [ Upstream commit 4d7d39a1 ]
      
      We accidentally return an uninitialized variable on success.
      
      Fixes: b6ff1b14 ("[SCSI] scsi_dh: Update EMC handler")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: default avatarHannes Reinecke <hare@suse.de>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      693e6513
    • Grygorii Maistrenko's avatar
      slub: do not merge cache if slub_debug contains a never-merge flag · 9ac38e30
      Grygorii Maistrenko authored
      
      [ Upstream commit c6e28895 ]
      
      In case CONFIG_SLUB_DEBUG_ON=n, find_mergeable() gets debug features from
      commandline but never checks if there are features from the
      SLAB_NEVER_MERGE set.
      
      As a result selected by slub_debug caches are always mergeable if they
      have been created without a custom constructor set or without one of the
      SLAB_* debug features on.
      
      This moves the SLAB_NEVER_MERGE check below the flags update from
      commandline to make sure it won't merge the slab cache if one of the debug
      features is on.
      
      Link: http://lkml.kernel.org/r/20170101124451.GA4740@lp-laptop-dSigned-off-by: default avatarGrygorii Maistrenko <grygoriimkd@gmail.com>
      Reviewed-by: default avatarPekka Enberg <penberg@kernel.org>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Acked-by: default avatarChristoph Lameter <cl@linux.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ac38e30
    • Eric Ren's avatar
      ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock · 315689d2
      Eric Ren authored
      
      [ Upstream commit 439a36b8 ]
      
      We are in the situation that we have to avoid recursive cluster locking,
      but there is no way to check if a cluster lock has been taken by a precess
      already.
      
      Mostly, we can avoid recursive locking by writing code carefully.
      However, we found that it's very hard to handle the routines that are
      invoked directly by vfs code.  For instance:
      
        const struct inode_operations ocfs2_file_iops = {
            .permission     = ocfs2_permission,
            .get_acl        = ocfs2_iop_get_acl,
            .set_acl        = ocfs2_iop_set_acl,
        };
      
      Both ocfs2_permission() and ocfs2_iop_get_acl() call ocfs2_inode_lock(PR):
      
        do_sys_open
         may_open
          inode_permission
           ocfs2_permission
            ocfs2_inode_lock() <=== first time
             generic_permission
              get_acl
               ocfs2_iop_get_acl
        	ocfs2_inode_lock() <=== recursive one
      
      A deadlock will occur if a remote EX request comes in between two of
      ocfs2_inode_lock().  Briefly describe how the deadlock is formed:
      
      On one hand, OCFS2_LOCK_BLOCKED flag of this lockres is set in
      BAST(ocfs2_generic_handle_bast) when downconvert is started on behalf of
      the remote EX lock request.  Another hand, the recursive cluster lock
      (the second one) will be blocked in in __ocfs2_cluster_lock() because of
      OCFS2_LOCK_BLOCKED.  But, the downconvert never complete, why? because
      there is no chance for the first cluster lock on this node to be
      unlocked - we block ourselves in the code path.
      
      The idea to fix this issue is mostly taken from gfs2 code.
      
      1. introduce a new field: struct ocfs2_lock_res.l_holders, to keep track
         of the processes' pid who has taken the cluster lock of this lock
         resource;
      
      2. introduce a new flag for ocfs2_inode_lock_full:
         OCFS2_META_LOCK_GETBH; it means just getting back disk inode bh for
         us if we've got cluster lock.
      
      3. export a helper: ocfs2_is_locked_by_me() is used to check if we have
         got the cluster lock in the upper code path.
      
      The tracking logic should be used by some of the ocfs2 vfs's callbacks,
      to solve the recursive locking issue cuased by the fact that vfs
      routines can call into each other.
      
      The performance penalty of processing the holder list should only be
      seen at a few cases where the tracking logic is used, such as get/set
      acl.
      
      You may ask what if the first time we got a PR lock, and the second time
      we want a EX lock? fortunately, this case never happens in the real
      world, as far as I can see, including permission check,
      (get|set)_(acl|attr), and the gfs2 code also do so.
      
      [sfr@canb.auug.org.au remove some inlines]
      Link: http://lkml.kernel.org/r/20170117100948.11657-2-zren@suse.comSigned-off-by: default avatarEric Ren <zren@suse.com>
      Reviewed-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: default avatarJoseph Qi <jiangqi903@gmail.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      315689d2
    • Milan Broz's avatar
      crypto: xts - Add ECB dependency · d3335f56
      Milan Broz authored
      
      [ Upstream commit 12cb3a1c ]
      
      Since the
         commit f1c131b4
         crypto: xts - Convert to skcipher
      the XTS mode is based on ECB, so the mode must select
      ECB otherwise it can fail to initialize.
      Signed-off-by: default avatarMilan Broz <gmazyland@gmail.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d3335f56
    • Majd Dibbiny's avatar
      net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs · 02744a55
      Majd Dibbiny authored
      
      [ Upstream commit 95f1ba9a ]
      
      In the VF driver, module parameter mlx4_log_num_mgm_entry_size was
      mistakenly overwritten -- and in a manner which overrode the
      device-managed flow steering option encoded in the parameter.
      
      log_num_mgm_entry_size is a global module parameter which
      affects all ConnectX-3 PFs installed on that host.
      If a VF changes log_num_mgm_entry_size, this will affect all PFs
      which are probed subsequent to the change (by disabling DMFS for
      those PFs).
      
      Fixes: 3c439b55 ("mlx4_core: Allow choosing flow steering mode")
      Signed-off-by: default avatarMajd Dibbiny <majd@mellanox.com>
      Reviewed-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02744a55
    • Vijay Kumar's avatar
      sparc64: Migrate hvcons irq to panicked cpu · 7bf94b95
      Vijay Kumar authored
      
      [ Upstream commit 7dd4fcf5 ]
      
      On panic, all other CPUs are stopped except the one which had
      hit panic. To keep console alive, we need to migrate hvcons irq
      to panicked CPU.
      Signed-off-by: default avatarVijay Kumar <vijay.ac.kumar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7bf94b95
    • Shaohua Li's avatar
      md/linear: shutup lockdep warnning · d14591e8
      Shaohua Li authored
      
      [ Upstream commit d939cdfd ]
      
      Commit 03a9e24e(md linear: fix a race between linear_add() and
      linear_congested()) introduces the warnning.
      Acked-by: default avatarColy Li <colyli@suse.de>
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d14591e8
    • Jaegeuk Kim's avatar
      f2fs: do not wait for writeback in write_begin · 48ca88f9
      Jaegeuk Kim authored
      
      [ Upstream commit 86d54795 ]
      
      Otherwise we can get livelock like below.
      
      [79880.428136] dbench          D    0 18405  18404 0x00000000
      [79880.428139] Call Trace:
      [79880.428142]  __schedule+0x219/0x6b0
      [79880.428144]  schedule+0x36/0x80
      [79880.428147]  schedule_timeout+0x243/0x2e0
      [79880.428152]  ? update_sd_lb_stats+0x16b/0x5f0
      [79880.428155]  ? ktime_get+0x3c/0xb0
      [79880.428157]  io_schedule_timeout+0xa6/0x110
      [79880.428161]  __lock_page+0xf7/0x130
      [79880.428164]  ? unlock_page+0x30/0x30
      [79880.428167]  pagecache_get_page+0x16b/0x250
      [79880.428171]  grab_cache_page_write_begin+0x20/0x40
      [79880.428182]  f2fs_write_begin+0xa2/0xdb0 [f2fs]
      [79880.428192]  ? f2fs_mark_inode_dirty_sync+0x16/0x30 [f2fs]
      [79880.428197]  ? kmem_cache_free+0x79/0x200
      [79880.428203]  ? __mark_inode_dirty+0x17f/0x360
      [79880.428206]  generic_perform_write+0xbb/0x190
      [79880.428213]  ? file_update_time+0xa4/0xf0
      [79880.428217]  __generic_file_write_iter+0x19b/0x1e0
      [79880.428226]  f2fs_file_write_iter+0x9c/0x180 [f2fs]
      [79880.428231]  __vfs_write+0xc5/0x140
      [79880.428235]  vfs_write+0xb2/0x1b0
      [79880.428238]  SyS_write+0x46/0xa0
      [79880.428242]  entry_SYSCALL_64_fastpath+0x1e/0xad
      
      Fixes: cae96a5c8ab6 ("f2fs: check io submission more precisely")
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      48ca88f9
    • Robbie Ko's avatar
      Btrfs: send, fix failure to rename top level inode due to name collision · 3109615b
      Robbie Ko authored
      
      [ Upstream commit 4dd9920d ]
      
      Under certain situations, an incremental send operation can fail due to a
      premature attempt to create a new top level inode (a direct child of the
      subvolume/snapshot root) whose name collides with another inode that was
      removed from the send snapshot.
      
      Consider the following example scenario.
      
      Parent snapshot:
      
        .                 (ino 256, gen 8)
        |---- a1/         (ino 257, gen 9)
        |---- a2/         (ino 258, gen 9)
      
      Send snapshot:
      
        .                 (ino 256, gen 3)
        |---- a2/         (ino 257, gen 7)
      
      In this scenario, when receiving the incremental send stream, the btrfs
      receive command fails like this (ran in verbose mode, -vv argument):
      
        rmdir a1
        mkfile o257-7-0
        rename o257-7-0 -> a2
        ERROR: rename o257-7-0 -> a2 failed: Is a directory
      
      What happens when computing the incremental send stream is:
      
      1) An operation to remove the directory with inode number 257 and
         generation 9 is issued.
      
      2) An operation to create the inode with number 257 and generation 7 is
         issued. This creates the inode with an orphanized name of "o257-7-0".
      
      3) An operation rename the new inode 257 to its final name, "a2", is
         issued. This is incorrect because inode 258, which has the same name
         and it's a child of the same parent (root inode 256), was not yet
         processed and therefore no rmdir operation for it was yet issued.
         The rename operation is issued because we fail to detect that the
         name of the new inode 257 collides with inode 258, because their
         parent, a subvolume/snapshot root (inode 256) has a different
         generation in both snapshots.
      
      So fix this by ignoring the generation value of a parent directory that
      matches a root inode (number 256) when we are checking if the name of the
      inode currently being processed collides with the name of some other
      inode that was not yet processed.
      
      We can achieve this scenario of different inodes with the same number but
      different generation values either by mounting a filesystem with the inode
      cache option (-o inode_cache) or by creating and sending snapshots across
      different filesystems, like in the following example:
      
        $ mkfs.btrfs -f /dev/sdb
        $ mount /dev/sdb /mnt
        $ mkdir /mnt/a1
        $ mkdir /mnt/a2
        $ btrfs subvolume snapshot -r /mnt /mnt/snap1
        $ btrfs send /mnt/snap1 -f /tmp/1.snap
        $ umount /mnt
      
        $ mkfs.btrfs -f /dev/sdc
        $ mount /dev/sdc /mnt
        $ touch /mnt/a2
        $ btrfs subvolume snapshot -r /mnt /mnt/snap2
        $ btrfs receive /mnt -f /tmp/1.snap
        # Take note that once the filesystem is created, its current
        # generation has value 7 so the inode from the second snapshot has
        # a generation value of 7. And after receiving the first snapshot
        # the filesystem is at a generation value of 10, because the call to
        # create the second snapshot bumps the generation to 8 (the snapshot
        # creation ioctl does a transaction commit), the receive command calls
        # the snapshot creation ioctl to create the first snapshot, which bumps
        # the filesystem's generation to 9, and finally when the receive
        # operation finishes it calls an ioctl to transition the first snapshot
        # (snap1) from RW mode to RO mode, which does another transaction commit
        # and bumps the filesystem's generation to 10.
        $ rm -f /tmp/1.snap
        $ btrfs send /mnt/snap1 -f /tmp/1.snap
        $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.snap
        $ umount /mnt
      
        $ mkfs.btrfs -f /dev/sdd
        $ mount /dev/sdd /mnt
        $ btrfs receive /mnt /tmp/1.snap
        # Receive of snapshot snap2 used to fail.
        $ btrfs receive /mnt /tmp/2.snap
      Signed-off-by: default avatarRobbie Ko <robbieko@synology.com>
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      [Rewrote changelog to be more precise and clear]
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3109615b
    • Christophe JAILLET's avatar
      iio: adc: xilinx: Fix error handling · 4d134d83
      Christophe JAILLET authored
      
      [ Upstream commit ca1c39ef ]
      
      Reorder error handling labels in order to match the way resources have
      been allocated.
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Acked-by: default avatarLars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: default avatarJonathan Cameron <jic23@kernel.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d134d83
    • Jarno Rajahalme's avatar
      netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value. · 5c65ed5c
      Jarno Rajahalme authored
      
      [ Upstream commit 4b86c459 ]
      
      Commit 4dee62b1 ("netfilter: nf_ct_expect: nf_ct_expect_insert()
      returns void") inadvertently changed the successful return value of
      nf_ct_expect_related_report() from 0 to 1 due to
      __nf_ct_expect_check() returning 1 on success.  Prevent this
      regression in the future by changing the return value of
      __nf_ct_expect_check() to 0 on success.
      Signed-off-by: default avatarJarno Rajahalme <jarno@ovn.org>
      Acked-by: default avatarJoe Stringer <joe@ovn.org>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5c65ed5c
    • Eric Dumazet's avatar
      net/mlx4_en: fix overflow in mlx4_en_init_timestamp() · 743a3ce1
      Eric Dumazet authored
      
      [ Upstream commit 47d3a075 ]
      
      The cited commit makes a great job of finding optimal shift/multiplier
      values assuming a 10 seconds wrap around, but forgot to change the
      overflow_period computation.
      
      It overflows in cyclecounter_cyc2ns(), and the final result is 804 ms,
      which is silly.
      
      Lets simply use 5 seconds, no need to recompute this, given how it is
      supposed to work.
      
      Later, we will use a timer instead of a work queue, since the new RX
      allocation schem will no longer need mlx4_en_recover_from_oom() and the
      service_task firing every 250 ms.
      
      Fixes: 31c128b6 ("net/mlx4_en: Choose time-stamping shift value according to HW frequency")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Eugenia Emantayev <eugenia@mellanox.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      743a3ce1
    • Emmanuel Grumbach's avatar
      mac80211: fix power saving clients handling in iwlwifi · 7ed668ee
      Emmanuel Grumbach authored
      
      [ Upstream commit d98937f4 ]
      
      iwlwifi now supports RSS and can't let mac80211 track the
      PS state based on the Rx frames since they can come out of
      order. iwlwifi is now advertising AP_LINK_PS, and uses
      explicit notifications to teach mac80211 about the PS state
      of the stations and the PS poll / uAPSD trigger frames
      coming our way from the peers.
      
      Because of that, the TIM stopped being maintained in
      mac80211. I tried to fix this in commit c68df2e7
      ("mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE")
      but that was later reverted by Felix in commit 6c18a6b4
      ("Revert "mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE")
      since it broke drivers that do not implement set_tim.
      
      Since none of the drivers that set AP_LINK_PS have the
      set_tim() handler set besides iwlwifi, I can bail out in
      __sta_info_recalc_tim if AP_LINK_PS AND .set_tim is not
      implemented.
      Signed-off-by: default avatarEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ed668ee
    • Johannes Berg's avatar
      mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length · 3e8c1a04
      Johannes Berg authored
      
      [ Upstream commit ff4dd73d ]
      
      Unfortunately, the nla policy was defined to have HWSIM_ATTR_RADIO_NAME
      as an NLA_STRING, rather than NLA_NUL_STRING, so we can't use it as a
      NUL-terminated string in the kernel.
      
      Rather than break the API, kasprintf() the string to a new buffer to
      guarantee NUL termination.
      Reported-by: default avatarAndrew Zaborowski <andrew.zaborowski@intel.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e8c1a04
    • Franck Demathieu's avatar
      irqchip/crossbar: Fix incorrect type of local variables · 4a464dac
      Franck Demathieu authored
      
      [ Upstream commit b28ace12 ]
      
      The max and entry variables are unsigned according to the dt-bindings.
      Fix following 3 sparse issues (-Wtypesign):
      
        drivers/irqchip/irq-crossbar.c:222:52: warning: incorrect type in argument 3 (different signedness)
        drivers/irqchip/irq-crossbar.c:222:52:    expected unsigned int [usertype] *out_value
        drivers/irqchip/irq-crossbar.c:222:52:    got int *<noident>
      
        drivers/irqchip/irq-crossbar.c:245:56: warning: incorrect type in argument 4 (different signedness)
        drivers/irqchip/irq-crossbar.c:245:56:    expected unsigned int [usertype] *out_value
        drivers/irqchip/irq-crossbar.c:245:56:    got int *<noident>
      
        drivers/irqchip/irq-crossbar.c:263:56: warning: incorrect type in argument 4 (different signedness)
        drivers/irqchip/irq-crossbar.c:263:56:    expected unsigned int [usertype] *out_value
        drivers/irqchip/irq-crossbar.c:263:56:    got int *<noident>
      Signed-off-by: default avatarFranck Demathieu <fdemathieu@gmail.com>
      Cc: marc.zyngier@arm.com
      Cc: jason@lakedaemon.net
      Link: http://lkml.kernel.org/r/20170223094855.6546-1-fdemathieu@gmail.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4a464dac
    • Arnd Bergmann's avatar
      watchdog: kempld: fix gcc-4.3 build · 7e53f039
      Arnd Bergmann authored
      
      [ Upstream commit 3736d4eb ]
      
      gcc-4.3 can't decide whether the constant value in
      kempld_prescaler[PRESCALER_21] is built-time constant or
      not, and gets confused by the logic in do_div():
      
      drivers/watchdog/kempld_wdt.o: In function `kempld_wdt_set_stage_timeout':
      kempld_wdt.c:(.text.kempld_wdt_set_stage_timeout+0x130): undefined reference to `__aeabi_uldivmod'
      
      This adds a call to ACCESS_ONCE() to force it to not consider
      it to be constant, and leaves the more efficient normal case
      in place for modern compilers, using an #ifdef to annotate
      why we do this hack.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7e53f039
    • Peter Zijlstra's avatar
      locking/lockdep: Add nest_lock integrity test · 28eab3db
      Peter Zijlstra authored
      
      [ Upstream commit 7fb4a2ce ]
      
      Boqun reported that hlock->references can overflow. Add a debug test
      for that to generate a clear error when this happens.
      
      Without this, lockdep is likely to report a mysterious failure on
      unlock.
      Reported-by: default avatarBoqun Feng <boqun.feng@gmail.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      28eab3db
    • Greg Kroah-Hartman's avatar
      Revert "bsg-lib: don't free job in bsg_prepare_job" · d44e463c
      Greg Kroah-Hartman authored
      This reverts commit 668cee82 which was
      commit f507b54d upstream.
      
      Ben reports:
      	That function doesn't exist here (it was introduced in 4.13).
      	Instead, this backport has modified bsg_create_job(), creating a
      	leak.  Please revert this on the 3.18, 4.4 and 4.9 stable
      	branches.
      
      So I'm dropping it from here.
      Reported-by: default avatarBen Hutchings <ben.hutchings@codethink.co.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Ming Lei <ming.lei@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
      d44e463c
    • Parthasarathy Bhuvaragan's avatar
      tipc: use only positive error codes in messages · 01e3e631
      Parthasarathy Bhuvaragan authored
      
      [ Upstream commit aad06212 ]
      
      In commit e3a77561 ("tipc: split up function tipc_msg_eval()"),
      we have updated the function tipc_msg_lookup_dest() to set the error
      codes to negative values at destination lookup failures. Thus when
      the function sets the error code to -TIPC_ERR_NO_NAME, its inserted
      into the 4 bit error field of the message header as 0xf instead of
      TIPC_ERR_NO_NAME (1). The value 0xf is an unknown error code.
      
      In this commit, we set only positive error code.
      
      Fixes: e3a77561 ("tipc: split up function tipc_msg_eval()")
      Signed-off-by: default avatarParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      01e3e631
    • Christoph Paasch's avatar
      net: Set sk_prot_creator when cloning sockets to the right proto · 68569970
      Christoph Paasch authored
      
      [ Upstream commit 9d538fa6 ]
      
      sk->sk_prot and sk->sk_prot_creator can differ when the app uses
      IPV6_ADDRFORM (transforming an IPv6-socket to an IPv4-one).
      Which is why sk_prot_creator is there to make sure that sk_prot_free()
      does the kmem_cache_free() on the right kmem_cache slab.
      
      Now, if such a socket gets transformed back to a listening socket (using
      connect() with AF_UNSPEC) we will allocate an IPv4 tcp_sock through
      sk_clone_lock() when a new connection comes in. But sk_prot_creator will
      still point to the IPv6 kmem_cache (as everything got copied in
      sk_clone_lock()). When freeing, we will thus put this
      memory back into the IPv6 kmem_cache although it was allocated in the
      IPv4 cache. I have seen memory corruption happening because of this.
      
      With slub-debugging and MEMCG_KMEM enabled this gives the warning
      	"cache_from_obj: Wrong slab cache. TCPv6 but object is from TCP"
      
      A C-program to trigger this:
      
      void main(void)
      {
              int fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
              int new_fd, newest_fd, client_fd;
              struct sockaddr_in6 bind_addr;
              struct sockaddr_in bind_addr4, client_addr1, client_addr2;
              struct sockaddr unsp;
              int val;
      
              memset(&bind_addr, 0, sizeof(bind_addr));
              bind_addr.sin6_family = AF_INET6;
              bind_addr.sin6_port = ntohs(42424);
      
              memset(&client_addr1, 0, sizeof(client_addr1));
              client_addr1.sin_family = AF_INET;
              client_addr1.sin_port = ntohs(42424);
              client_addr1.sin_addr.s_addr = inet_addr("127.0.0.1");
      
              memset(&client_addr2, 0, sizeof(client_addr2));
              client_addr2.sin_family = AF_INET;
              client_addr2.sin_port = ntohs(42421);
              client_addr2.sin_addr.s_addr = inet_addr("127.0.0.1");
      
              memset(&unsp, 0, sizeof(unsp));
              unsp.sa_family = AF_UNSPEC;
      
              bind(fd, (struct sockaddr *)&bind_addr, sizeof(bind_addr));
      
              listen(fd, 5);
      
              client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
              connect(client_fd, (struct sockaddr *)&client_addr1, sizeof(client_addr1));
              new_fd = accept(fd, NULL, NULL);
              close(fd);
      
              val = AF_INET;
              setsockopt(new_fd, SOL_IPV6, IPV6_ADDRFORM, &val, sizeof(val));
      
              connect(new_fd, &unsp, sizeof(unsp));
      
              memset(&bind_addr4, 0, sizeof(bind_addr4));
              bind_addr4.sin_family = AF_INET;
              bind_addr4.sin_port = ntohs(42421);
              bind(new_fd, (struct sockaddr *)&bind_addr4, sizeof(bind_addr4));
      
              listen(new_fd, 5);
      
              client_fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
              connect(client_fd, (struct sockaddr *)&client_addr2, sizeof(client_addr2));
      
              newest_fd = accept(new_fd, NULL, NULL);
              close(new_fd);
      
              close(client_fd);
              close(new_fd);
      }
      
      As far as I can see, this bug has been there since the beginning of the
      git-days.
      Signed-off-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      68569970
    • Willem de Bruijn's avatar
      packet: only test po->has_vnet_hdr once in packet_snd · 1299f7e1
      Willem de Bruijn authored
      
      [ Upstream commit da7c9561 ]
      
      Packet socket option po->has_vnet_hdr can be updated concurrently with
      other operations if no ring is attached.
      
      Do not test the option twice in packet_snd, as the value may change in
      between calls. A race on setsockopt disable may cause a packet > mtu
      to be sent without having GSO options set.
      
      Fixes: bfd5f4a3 ("packet: Add GSO/csum offload support.")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1299f7e1
    • Willem de Bruijn's avatar
      packet: in packet_do_bind, test fanout with bind_lock held · 1b6c80e7
      Willem de Bruijn authored
      
      [ Upstream commit 4971613c ]
      
      Once a socket has po->fanout set, it remains a member of the group
      until it is destroyed. The prot_hook must be constant and identical
      across sockets in the group.
      
      If fanout_add races with packet_do_bind between the test of po->fanout
      and taking the lock, the bind call may make type or dev inconsistent
      with that of the fanout group.
      
      Hold po->bind_lock when testing po->fanout to avoid this race.
      
      I had to introduce artificial delay (local_bh_enable) to actually
      observe the race.
      
      Fixes: dc99f600 ("packet: Add fanout support.")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1b6c80e7
    • Alexander Potapenko's avatar
      tun: bail out from tun_get_user() if the skb is empty · ee534927
      Alexander Potapenko authored
      
      [ Upstream commit 2580c4c1 ]
      
      KMSAN (https://github.com/google/kmsan) reported accessing uninitialized
      skb->data[0] in the case the skb is empty (i.e. skb->len is 0):
      
      ================================================
      BUG: KMSAN: use of uninitialized memory in tun_get_user+0x19ba/0x3770
      CPU: 0 PID: 3051 Comm: probe Not tainted 4.13.0+ #3140
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
      ...
       __msan_warning_32+0x66/0xb0 mm/kmsan/kmsan_instr.c:477
       tun_get_user+0x19ba/0x3770 drivers/net/tun.c:1301
       tun_chr_write_iter+0x19f/0x300 drivers/net/tun.c:1365
       call_write_iter ./include/linux/fs.h:1743
       new_sync_write fs/read_write.c:457
       __vfs_write+0x6c3/0x7f0 fs/read_write.c:470
       vfs_write+0x3e4/0x770 fs/read_write.c:518
       SYSC_write+0x12f/0x2b0 fs/read_write.c:565
       SyS_write+0x55/0x80 fs/read_write.c:557
       do_syscall_64+0x242/0x330 arch/x86/entry/common.c:284
       entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:245
      ...
      origin:
      ...
       kmsan_poison_shadow+0x6e/0xc0 mm/kmsan/kmsan.c:211
       slab_alloc_node mm/slub.c:2732
       __kmalloc_node_track_caller+0x351/0x370 mm/slub.c:4351
       __kmalloc_reserve net/core/skbuff.c:138
       __alloc_skb+0x26a/0x810 net/core/skbuff.c:231
       alloc_skb ./include/linux/skbuff.h:903
       alloc_skb_with_frags+0x1d7/0xc80 net/core/skbuff.c:4756
       sock_alloc_send_pskb+0xabf/0xfe0 net/core/sock.c:2037
       tun_alloc_skb drivers/net/tun.c:1144
       tun_get_user+0x9a8/0x3770 drivers/net/tun.c:1274
       tun_chr_write_iter+0x19f/0x300 drivers/net/tun.c:1365
       call_write_iter ./include/linux/fs.h:1743
       new_sync_write fs/read_write.c:457
       __vfs_write+0x6c3/0x7f0 fs/read_write.c:470
       vfs_write+0x3e4/0x770 fs/read_write.c:518
       SYSC_write+0x12f/0x2b0 fs/read_write.c:565
       SyS_write+0x55/0x80 fs/read_write.c:557
       do_syscall_64+0x242/0x330 arch/x86/entry/common.c:284
       return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:245
      ================================================
      
      Make sure tun_get_user() doesn't touch skb->data[0] unless there is
      actual data.
      
      C reproducer below:
      ==========================
          // autogenerated by syzkaller (http://github.com/google/syzkaller)
      
          #define _GNU_SOURCE
      
          #include <fcntl.h>
          #include <linux/if_tun.h>
          #include <netinet/ip.h>
          #include <net/if.h>
          #include <string.h>
          #include <sys/ioctl.h>
      
          int main()
          {
            int sock = socket(PF_INET, SOCK_STREAM, IPPROTO_IP);
            int tun_fd = open("/dev/net/tun", O_RDWR);
            struct ifreq req;
            memset(&req, 0, sizeof(struct ifreq));
            strcpy((char*)&req.ifr_name, "gre0");
            req.ifr_flags = IFF_UP | IFF_MULTICAST;
            ioctl(tun_fd, TUNSETIFF, &req);
            ioctl(sock, SIOCSIFFLAGS, "gre0");
            write(tun_fd, "hi", 0);
            return 0;
          }
      ==========================
      Signed-off-by: default avatarAlexander Potapenko <glider@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee534927
    • Sabrina Dubroca's avatar
      l2tp: fix race condition in l2tp_tunnel_delete · b5f689d9
      Sabrina Dubroca authored
      
      [ Upstream commit 62b982ee ]
      
      If we try to delete the same tunnel twice, the first delete operation
      does a lookup (l2tp_tunnel_get), finds the tunnel, calls
      l2tp_tunnel_delete, which queues it for deletion by
      l2tp_tunnel_del_work.
      
      The second delete operation also finds the tunnel and calls
      l2tp_tunnel_delete. If the workqueue has already fired and started
      running l2tp_tunnel_del_work, then l2tp_tunnel_delete will queue the
      same tunnel a second time, and try to free the socket again.
      
      Add a dead flag to prevent firing the workqueue twice. Then we can
      remove the check of queue_work's result that was meant to prevent that
      race but doesn't.
      
      Reproducer:
      
          ip l2tp add tunnel tunnel_id 3000 peer_tunnel_id 4000 local 192.168.0.2 remote 192.168.0.1 encap udp udp_sport 5000 udp_dport 6000
          ip l2tp add session name l2tp1 tunnel_id 3000 session_id 1000 peer_session_id 2000
          ip link set l2tp1 up
          ip l2tp del tunnel tunnel_id 3000
          ip l2tp del tunnel tunnel_id 3000
      
      Fixes: f8ccac0e ("l2tp: put tunnel socket release on a workqueue")
      Reported-by: default avatarJianlin Shi <jishi@redhat.com>
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Acked-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5f689d9
    • Ridge Kennedy's avatar
      l2tp: Avoid schedule while atomic in exit_net · 110cf3dd
      Ridge Kennedy authored
      
      [ Upstream commit 12d656af ]
      
      While destroying a network namespace that contains a L2TP tunnel a
      "BUG: scheduling while atomic" can be observed.
      
      Enabling lockdep shows that this is happening because l2tp_exit_net()
      is calling l2tp_tunnel_closeall() (via l2tp_tunnel_delete()) from
      within an RCU critical section.
      
      l2tp_exit_net() takes rcu_read_lock_bh()
        << list_for_each_entry_rcu() >>
        l2tp_tunnel_delete()
          l2tp_tunnel_closeall()
            __l2tp_session_unhash()
              synchronize_rcu() << Illegal inside RCU critical section >>
      
      BUG: sleeping function called from invalid context
      in_atomic(): 1, irqs_disabled(): 0, pid: 86, name: kworker/u16:2
      INFO: lockdep is turned off.
      CPU: 2 PID: 86 Comm: kworker/u16:2 Tainted: G        W  O    4.4.6-at1 #2
      Hardware name: Xen HVM domU, BIOS 4.6.1-xs125300 05/09/2016
      Workqueue: netns cleanup_net
       0000000000000000 ffff880202417b90 ffffffff812b0013 ffff880202410ac0
       ffffffff81870de8 ffff880202417bb8 ffffffff8107aee8 ffffffff81870de8
       0000000000000c51 0000000000000000 ffff880202417be0 ffffffff8107b024
      Call Trace:
       [<ffffffff812b0013>] dump_stack+0x85/0xc2
       [<ffffffff8107aee8>] ___might_sleep+0x148/0x240
       [<ffffffff8107b024>] __might_sleep+0x44/0x80
       [<ffffffff810b21bd>] synchronize_sched+0x2d/0xe0
       [<ffffffff8109be6d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff8105c7bb>] ? __local_bh_enable_ip+0x6b/0xc0
       [<ffffffff816a1b00>] ? _raw_spin_unlock_bh+0x30/0x40
       [<ffffffff81667482>] __l2tp_session_unhash+0x172/0x220
       [<ffffffff81667397>] ? __l2tp_session_unhash+0x87/0x220
       [<ffffffff8166888b>] l2tp_tunnel_closeall+0x9b/0x140
       [<ffffffff81668c74>] l2tp_tunnel_delete+0x14/0x60
       [<ffffffff81668dd0>] l2tp_exit_net+0x110/0x270
       [<ffffffff81668d5c>] ? l2tp_exit_net+0x9c/0x270
       [<ffffffff815001c3>] ops_exit_list.isra.6+0x33/0x60
       [<ffffffff81501166>] cleanup_net+0x1b6/0x280
       ...
      
      This bug can easily be reproduced with a few steps:
      
       $ sudo unshare -n bash  # Create a shell in a new namespace
       # ip link set lo up
       # ip addr add 127.0.0.1 dev lo
       # ip l2tp add tunnel remote 127.0.0.1 local 127.0.0.1 tunnel_id 1 \
          peer_tunnel_id 1 udp_sport 50000 udp_dport 50000
       # ip l2tp add session name foo tunnel_id 1 session_id 1 \
          peer_session_id 1
       # ip link set foo up
       # exit  # Exit the shell, in turn exiting the namespace
       $ dmesg
       ...
       [942121.089216] BUG: scheduling while atomic: kworker/u16:3/13872/0x00000200
       ...
      
      To fix this, move the call to l2tp_tunnel_closeall() out of the RCU
      critical section, and instead call it from l2tp_tunnel_del_work(), which
      is running from the l2tp_wq workqueue.
      
      Fixes: 2b551c6e ("l2tp: close sessions before initiating tunnel delete")
      Signed-off-by: default avatarRidge Kennedy <ridge.kennedy@alliedtelesis.co.nz>
      Acked-by: default avatarGuillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      110cf3dd
    • Alexey Kodanev's avatar
      vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmit · 93040aa1
      Alexey Kodanev authored
      
      [ Upstream commit 36f6ee22 ]
      
      When running LTP IPsec tests, KASan might report:
      
      BUG: KASAN: use-after-free in vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
      Read of size 4 at addr ffff880dc6ad1980 by task swapper/0/0
      ...
      Call Trace:
        <IRQ>
        dump_stack+0x63/0x89
        print_address_description+0x7c/0x290
        kasan_report+0x28d/0x370
        ? vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
        __asan_report_load4_noabort+0x19/0x20
        vti_tunnel_xmit+0xeee/0xff0 [ip_vti]
        ? vti_init_net+0x190/0x190 [ip_vti]
        ? save_stack_trace+0x1b/0x20
        ? save_stack+0x46/0xd0
        dev_hard_start_xmit+0x147/0x510
        ? icmp_echo.part.24+0x1f0/0x210
        __dev_queue_xmit+0x1394/0x1c60
      ...
      Freed by task 0:
        save_stack_trace+0x1b/0x20
        save_stack+0x46/0xd0
        kasan_slab_free+0x70/0xc0
        kmem_cache_free+0x81/0x1e0
        kfree_skbmem+0xb1/0xe0
        kfree_skb+0x75/0x170
        kfree_skb_list+0x3e/0x60
        __dev_queue_xmit+0x1298/0x1c60
        dev_queue_xmit+0x10/0x20
        neigh_resolve_output+0x3a8/0x740
        ip_finish_output2+0x5c0/0xe70
        ip_finish_output+0x4ba/0x680
        ip_output+0x1c1/0x3a0
        xfrm_output_resume+0xc65/0x13d0
        xfrm_output+0x1e4/0x380
        xfrm4_output_finish+0x5c/0x70
      
      Can be fixed if we get skb->len before dst_output().
      
      Fixes: b9959fd3 ("vti: switch to new ip tunnel code")
      Fixes: 22e1b23d ("vti6: Support inter address family tunneling.")
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93040aa1
    • Meng Xu's avatar
      isdn/i4l: fetch the ppp_write buffer in one shot · d9cb4dc0
      Meng Xu authored
      
      [ Upstream commit 02388bf8 ]
      
      In isdn_ppp_write(), the header (i.e., protobuf) of the buffer is
      fetched twice from userspace. The first fetch is used to peek at the
      protocol of the message and reset the huptimer if necessary; while the
      second fetch copies in the whole buffer. However, given that buf resides
      in userspace memory, a user process can race to change its memory content
      across fetches. By doing so, we can either avoid resetting the huptimer
      for any type of packets (by first setting proto to PPP_LCP and later
      change to the actual type) or force resetting the huptimer for LCP
      packets.
      
      This patch changes this double-fetch behavior into two single fetches
      decided by condition (lp->isdn_device < 0 || lp->isdn_channel <0).
      A more detailed discussion can be found at
      https://marc.info/?l=linux-kernel&m=150586376926123&w=2Signed-off-by: default avatarMeng Xu <mengxu.gatech@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9cb4dc0
    • Yonghong Song's avatar
      bpf: one perf event close won't free bpf program attached by another perf event · 1a4f1ecd
      Yonghong Song authored
      
      [ Upstream commit ec9dd352 ]
      
      This patch fixes a bug exhibited by the following scenario:
        1. fd1 = perf_event_open with attr.config = ID1
        2. attach bpf program prog1 to fd1
        3. fd2 = perf_event_open with attr.config = ID1
           <this will be successful>
        4. user program closes fd2 and prog1 is detached from the tracepoint.
        5. user program with fd1 does not work properly as tracepoint
           no output any more.
      
      The issue happens at step 4. Multiple perf_event_open can be called
      successfully, but only one bpf prog pointer in the tp_event. In the
      current logic, any fd release for the same tp_event will free
      the tp_event->prog.
      
      The fix is to free tp_event->prog only when the closing fd
      corresponds to the one which registered the program.
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1a4f1ecd
    • Willem de Bruijn's avatar
      packet: hold bind lock when rebinding to fanout hook · 5be6824b
      Willem de Bruijn authored
      
      [ Upstream commit 008ba2a1 ]
      
      Packet socket bind operations must hold the po->bind_lock. This keeps
      po->running consistent with whether the socket is actually on a ptype
      list to receive packets.
      
      fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
      binds the fanout object to receive through packet_rcv_fanout.
      
      Make it hold the po->bind_lock when testing po->running and rebinding.
      Else, it can race with other rebind operations, such as that in
      packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
      can result in a socket being added to a fanout group twice, causing
      use-after-free KASAN bug reports, among others.
      
      Reported independently by both trinity and syzkaller.
      Verified that the syzkaller reproducer passes after this patch.
      
      Fixes: dc99f600 ("packet: Add fanout support.")
      Reported-by: default avatarnixioaming <nixiaoming@huawei.com>
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5be6824b