1. 07 Mar, 2016 26 commits
  2. 04 Mar, 2016 2 commits
  3. 03 Mar, 2016 12 commits
    • Borislav Petkov's avatar
      EDAC, mc_sysfs: Fix freeing bus' name · 1e719417
      Borislav Petkov authored
      commit 12e26969 upstream.
      
      I get the splat below when modprobing/rmmoding EDAC drivers. It happens
      because bus->name is invalid after bus_unregister() has run. The Code: section
      below corresponds to:
      
        .loc 1 1108 0
        movq    672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus
        .loc 1 1109 0
        popq    %rbx    #
      
        .loc 1 1108 0
        movq    (%rax), %rdi    # _7->name,
        jmp     kfree   #
      
      and %rax has some funky stuff 2030203020312030 which looks a lot like
      something walked over it.
      
      Fix that by saving the name ptr before doing stuff to string it points to.
      
        general protection fault: 0000 [#1] SMP
        Modules linked in: ...
        CPU: 4 PID: 10318 Comm: modprobe Tainted: G          I EN  3.12.51-11-default+ #48
        Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011
        task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000
        RIP: 0010:[<ffffffffa019da92>]  [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core]
        RSP: 0018:ffff88030da3fe28  EFLAGS: 00010292
        RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c
        RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286
        RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110
        R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68
        R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000
        FS:  00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0
        Stack:
        Call Trace:
          i7core_unregister_mci.isra.9
          i7core_remove
          pci_device_remove
          __device_release_driver
          driver_detach
          bus_remove_driver
          pci_unregister_driver
          i7core_exit
          SyS_delete_module
          system_call_fastpath
          0x7fc9bf426536
        Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b
        RIP  [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core]
         RSP <ffff88030da3fe28>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Fixes: 7a623c03 ("edac: rewrite the sysfs code to use struct device")
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1e719417
    • Jeff Layton's avatar
      locks: fix unlock when fcntl_setlk races with a close · 2da2c15f
      Jeff Layton authored
      commit 7f3697e2 upstream.
      
      Dmitry reported that he was able to reproduce the WARN_ON_ONCE that
      fires in locks_free_lock_context when the flc_posix list isn't empty.
      
      The problem turns out to be that we're basically rebuilding the
      file_lock from scratch in fcntl_setlk when we discover that the setlk
      has raced with a close. If the l_whence field is SEEK_CUR or SEEK_END,
      then we may end up with fl_start and fl_end values that differ from
      when the lock was initially set, if the file position or length of the
      file has changed in the interim.
      
      Fix this by just reusing the same lock request structure, and simply
      override fl_type value with F_UNLCK as appropriate. That ensures that
      we really are unlocking the lock that was initially set.
      
      While we're there, make sure that we do pop a WARN_ON_ONCE if the
      removal ever fails. Also return -EBADF in this event, since that's
      what we would have returned if the close had happened earlier.
      
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Fixes: c293621b (stale POSIX lock handling)
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarJeff Layton <jeff.layton@primarydata.com>
      Acked-by: default avatar"J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2da2c15f
    • Konrad Rzeszutek Wilk's avatar
      xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted. · 7949b7bf
      Konrad Rzeszutek Wilk authored
      commit 4d8c8bd6 upstream.
      
      Occasionaly PV guests would crash with:
      
      pciback 0000:00:00.1: Xen PCI mapped GSI0 to IRQ16
      BUG: unable to handle kernel paging request at 0000000d1a8c0be0
      .. snip..
        <ffffffff8139ce1b>] find_next_bit+0xb/0x10
        [<ffffffff81387f22>] cpumask_next_and+0x22/0x40
        [<ffffffff813c1ef8>] pci_device_probe+0xb8/0x120
        [<ffffffff81529097>] ? driver_sysfs_add+0x77/0xa0
        [<ffffffff815293e4>] driver_probe_device+0x1a4/0x2d0
        [<ffffffff813c1ddd>] ? pci_match_device+0xdd/0x110
        [<ffffffff81529657>] __device_attach_driver+0xa7/0xb0
        [<ffffffff815295b0>] ? __driver_attach+0xa0/0xa0
        [<ffffffff81527622>] bus_for_each_drv+0x62/0x90
        [<ffffffff8152978d>] __device_attach+0xbd/0x110
        [<ffffffff815297fb>] device_attach+0xb/0x10
        [<ffffffff813b75ac>] pci_bus_add_device+0x3c/0x70
        [<ffffffff813b7618>] pci_bus_add_devices+0x38/0x80
        [<ffffffff813dc34e>] pcifront_scan_root+0x13e/0x1a0
        [<ffffffff817a0692>] pcifront_backend_changed+0x262/0x60b
        [<ffffffff814644c6>] ? xenbus_gather+0xd6/0x160
        [<ffffffff8120900f>] ? put_object+0x2f/0x50
        [<ffffffff81465c1d>] xenbus_otherend_changed+0x9d/0xa0
        [<ffffffff814678ee>] backend_changed+0xe/0x10
        [<ffffffff81463a28>] xenwatch_thread+0xc8/0x190
        [<ffffffff810f22f0>] ? woken_wake_function+0x10/0x10
      
      which was the result of two things:
      
      When we call pci_scan_root_bus we would pass in 'sd' (sysdata)
      pointer which was an 'pcifront_sd' structure. However in the
      pci_device_add it expects that the 'sd' is 'struct sysdata' and
      sets the dev->node to what is in sd->node (offset 4):
      
      set_dev_node(&dev->dev, pcibus_to_node(bus));
      
       __pcibus_to_node(const struct pci_bus *bus)
      {
              const struct pci_sysdata *sd = bus->sysdata;
      
              return sd->node;
      }
      
      However our structure was pcifront_sd which had nothing at that
      offset:
      
      struct pcifront_sd {
              int                        domain;    /*     0     4 */
              /* XXX 4 bytes hole, try to pack */
              struct pcifront_device *   pdev;      /*     8     8 */
      }
      
      That is an hole - filled with garbage as we used kmalloc instead of
      kzalloc (the second problem).
      
      This patch fixes the issue by:
       1) Use kzalloc to initialize to a well known state.
       2) Put 'struct pci_sysdata' at the start of 'pcifront_sd'. That
          way access to the 'node' will access the right offset.
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7949b7bf
    • Al Viro's avatar
      do_last(): don't let a bogus return value from ->open() et.al. to confuse us · 66efd9e7
      Al Viro authored
      commit c80567c8 upstream.
      
      ... into returning a positive to path_openat(), which would interpret that
      as "symlink had been encountered" and proceed to corrupt memory, etc.
      It can only happen due to a bug in some ->open() instance or in some LSM
      hook, etc., so we report any such event *and* make sure it doesn't trick
      us into further unpleasantness.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      66efd9e7
    • Simon Guinot's avatar
      kernel/resource.c: fix muxed resource handling in __request_region() · 2976c211
      Simon Guinot authored
      commit 59ceeaaf upstream.
      
      In __request_region, if a conflict with a BUSY and MUXED resource is
      detected, then the caller goes to sleep and waits for the resource to be
      released.  A pointer on the conflicting resource is kept.  At wake-up
      this pointer is used as a parent to retry to request the region.
      
      A first problem is that this pointer might well be invalid (if for
      example the conflicting resource have already been freed).  Another
      problem is that the next call to __request_region() fails to detect a
      remaining conflict.  The previously conflicting resource is passed as a
      parameter and __request_region() will look for a conflict among the
      children of this resource and not at the resource itself.  It is likely
      to succeed anyway, even if there is still a conflict.
      
      Instead, the parent of the conflicting resource should be passed to
      __request_region().
      
      As a fix, this patch doesn't update the parent resource pointer in the
      case we have to wait for a muxed region right after.
      Reported-and-tested-by: default avatarVincent Pelletier <plr.vincent@gmail.com>
      Signed-off-by: default avatarSimon Guinot <simon.guinot@sequanux.org>
      Tested-by: default avatarVincent Donnefort <vdonnefort@gmail.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2976c211
    • Stefan Hajnoczi's avatar
      sunrpc/cache: fix off-by-one in qword_get() · 6e234db5
      Stefan Hajnoczi authored
      commit b7052cd7 upstream.
      
      The qword_get() function NUL-terminates its output buffer.  If the input
      string is in hex format \xXXXX... and the same length as the output
      buffer, there is an off-by-one:
      
        int qword_get(char **bpp, char *dest, int bufsize)
        {
            ...
            while (len < bufsize) {
                ...
                *dest++ = (h << 4) | l;
                len++;
            }
            ...
            *dest = '\0';
            return len;
        }
      
      This patch ensures the NUL terminator doesn't fall outside the output
      buffer.
      Signed-off-by: default avatarStefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6e234db5
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Fix showing function event in available_events · b16ad9e9
      Steven Rostedt (Red Hat) authored
      commit d045437a upstream.
      
      The ftrace:function event is only displayed for parsing the function tracer
      data. It is not used to enable function tracing, and does not include an
      "enable" file in its event directory.
      
      Originally, this event was kept separate from other events because it did
      not have a ->reg parameter. But perf added a "reg" parameter for its use
      which caused issues, because it made the event available to functions where
      it was not compatible for.
      
      Commit 9b63776f "tracing: Do not enable function event with enable"
      added a TRACE_EVENT_FL_IGNORE_ENABLE flag that prevented the function event
      from being enabled by normal trace events. But this commit missed keeping
      the function event from being displayed by the "available_events" directory,
      which is used to show what events can be enabled by set_event.
      
      One documented way to enable all events is to:
      
       cat available_events > set_event
      
      But because the function event is displayed in the available_events, this
      now causes an INVALID error:
      
       cat: write error: Invalid argument
      Reported-by: default avatarChunyu Hu <chuhu@redhat.com>
      Fixes: 9b63776f "tracing: Do not enable function event with enable"
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b16ad9e9
    • Christian Borntraeger's avatar
      KVM: async_pf: do not warn on page allocation failures · 56ba745d
      Christian Borntraeger authored
      commit d7444794 upstream.
      
      In async_pf we try to allocate with NOWAIT to get an element quickly
      or fail. This code also handle failures gracefully. Lets silence
      potential page allocation failures under load.
      
      qemu-system-s39: page allocation failure: order:0,mode:0x2200000
      [...]
      Call Trace:
      ([<00000000001146b8>] show_trace+0xf8/0x148)
      [<000000000011476a>] show_stack+0x62/0xe8
      [<00000000004a36b8>] dump_stack+0x70/0x98
      [<0000000000272c3a>] warn_alloc_failed+0xd2/0x148
      [<000000000027709e>] __alloc_pages_nodemask+0x94e/0xb38
      [<00000000002cd36a>] new_slab+0x382/0x400
      [<00000000002cf7ac>] ___slab_alloc.constprop.30+0x2dc/0x378
      [<00000000002d03d0>] kmem_cache_alloc+0x160/0x1d0
      [<0000000000133db4>] kvm_setup_async_pf+0x6c/0x198
      [<000000000013dee8>] kvm_arch_vcpu_ioctl_run+0xd48/0xd58
      [<000000000012fcaa>] kvm_vcpu_ioctl+0x372/0x690
      [<00000000002f66f6>] do_vfs_ioctl+0x3be/0x510
      [<00000000002f68ec>] SyS_ioctl+0xa4/0xb8
      [<0000000000781c5e>] system_call+0xd6/0x264
      [<000003ffa24fa06a>] 0x3ffa24fa06a
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: default avatarDominik Dingel <dingel@linux.vnet.ibm.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      56ba745d
    • Benjamin Coddington's avatar
      NFSv4: Fix a dentry leak on alias use · 3fa6f32a
      Benjamin Coddington authored
      commit d9dfd8d7 upstream.
      
      In the case where d_add_unique() finds an appropriate alias to use it will
      have already incremented the reference count.  An additional dget() to swap
      the open context's dentry is unnecessary and will leak a reference.
      Signed-off-by: default avatarBenjamin Coddington <bcodding@redhat.com>
      Fixes: 275bb307 ("NFSv4: Move dentry instantiation into the NFSv4-...")
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3fa6f32a
    • Christoph Hellwig's avatar
      nfs: fix nfs_size_to_loff_t · febaddd7
      Christoph Hellwig authored
      commit 50ab8ec7 upstream.
      
      See http: //www.infradead.org/rpr.html
      X-Evolution-Source: 1451162204.2173.11@leira.trondhjem.org
      Content-Transfer-Encoding: 8bit
      Mime-Version: 1.0
      
      We support OFFSET_MAX just fine, so don't round down below it.  Also
      switch to using min_t to make the helper more readable.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Fixes: 433c9237 ("NFS: Clean up nfs_size_to_loff_t()")
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      febaddd7
    • Sebastian Andrzej Siewior's avatar
      PCI/AER: Flush workqueue on device remove to avoid use-after-free · b98e34c3
      Sebastian Andrzej Siewior authored
      commit 4ae2182b upstream.
      
      A Root Port's AER structure (rpc) contains a queue of events.  aer_irq()
      enqueues AER status information and schedules aer_isr() to dequeue and
      process it.  When we remove a device, aer_remove() waits for the queue to
      be empty, then frees the rpc struct.
      
      But aer_isr() references the rpc struct after dequeueing and possibly
      emptying the queue, which can cause a use-after-free error as in the
      following scenario with two threads, aer_isr() on the left and a
      concurrent aer_remove() on the right:
      
        Thread A                      Thread B
        --------                      --------
        aer_irq():
          rpc->prod_idx++
                                      aer_remove():
                                        wait_event(rpc->prod_idx == rpc->cons_idx)
                                        # now blocked until queue becomes empty
        aer_isr():                      # ...
          rpc->cons_idx++               # unblocked because queue is now empty
          ...                           kfree(rpc)
          mutex_unlock(&rpc->rpc_mutex)
      
      To prevent this problem, use flush_work() to wait until the last scheduled
      instance of aer_isr() has completed before freeing the rpc struct in
      aer_remove().
      
      I reproduced this use-after-free by flashing a device FPGA and
      re-enumerating the bus to find the new device.  With SLUB debug, this
      crashes with 0x6b bytes (POISON_FREE, the use-after-free magic number) in
      GPR25:
      
        pcieport 0000:00:00.0: AER: Multiple Corrected error received: id=0000
        Unable to handle kernel paging request for data at address 0x27ef9e3e
        Workqueue: events aer_isr
        GPR24: dd6aa000 6b6b6b6b 605f8378 605f8360 d99b12c0 604fc674 606b1704 d99b12c0
        NIP [602f5328] pci_walk_bus+0xd4/0x104
      
      [bhelgaas: changelog, stable tag]
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b98e34c3
    • Tejun Heo's avatar
      libata: fix sff host state machine locking while polling · 753a93a1
      Tejun Heo authored
      commit 8eee1d3e upstream.
      
      The bulk of ATA host state machine is implemented by
      ata_sff_hsm_move().  The function is called from either the interrupt
      handler or, if polling, a work item.  Unlike from the interrupt path,
      the polling path calls the function without holding the host lock and
      ata_sff_hsm_move() selectively grabs the lock.
      
      This is completely broken.  If an IRQ triggers while polling is in
      progress, the two can easily race and end up accessing the hardware
      and updating state machine state at the same time.  This can put the
      state machine in an illegal state and lead to a crash like the
      following.
      
        kernel BUG at drivers/ata/libata-sff.c:1302!
        invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
        Modules linked in:
        CPU: 1 PID: 10679 Comm: syz-executor Not tainted 4.5.0-rc1+ #300
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        task: ffff88002bd00000 ti: ffff88002e048000 task.ti: ffff88002e048000
        RIP: 0010:[<ffffffff83a83409>]  [<ffffffff83a83409>] ata_sff_hsm_move+0x619/0x1c60
        ...
        Call Trace:
         <IRQ>
         [<ffffffff83a84c31>] __ata_sff_port_intr+0x1e1/0x3a0 drivers/ata/libata-sff.c:1584
         [<ffffffff83a85611>] ata_bmdma_port_intr+0x71/0x400 drivers/ata/libata-sff.c:2877
         [<     inline     >] __ata_sff_interrupt drivers/ata/libata-sff.c:1629
         [<ffffffff83a85bf3>] ata_bmdma_interrupt+0x253/0x580 drivers/ata/libata-sff.c:2902
         [<ffffffff81479f98>] handle_irq_event_percpu+0x108/0x7e0 kernel/irq/handle.c:157
         [<ffffffff8147a717>] handle_irq_event+0xa7/0x140 kernel/irq/handle.c:205
         [<ffffffff81484573>] handle_edge_irq+0x1e3/0x8d0 kernel/irq/chip.c:623
         [<     inline     >] generic_handle_irq_desc include/linux/irqdesc.h:146
         [<ffffffff811a92bc>] handle_irq+0x10c/0x2a0 arch/x86/kernel/irq_64.c:78
         [<ffffffff811a7e4d>] do_IRQ+0x7d/0x1a0 arch/x86/kernel/irq.c:240
         [<ffffffff86653d4c>] common_interrupt+0x8c/0x8c arch/x86/entry/entry_64.S:520
         <EOI>
         [<     inline     >] rcu_lock_acquire include/linux/rcupdate.h:490
         [<     inline     >] rcu_read_lock include/linux/rcupdate.h:874
         [<ffffffff8164b4a1>] filemap_map_pages+0x131/0xba0 mm/filemap.c:2145
         [<     inline     >] do_fault_around mm/memory.c:2943
         [<     inline     >] do_read_fault mm/memory.c:2962
         [<     inline     >] do_fault mm/memory.c:3133
         [<     inline     >] handle_pte_fault mm/memory.c:3308
         [<     inline     >] __handle_mm_fault mm/memory.c:3418
         [<ffffffff816efb16>] handle_mm_fault+0x2516/0x49a0 mm/memory.c:3447
         [<ffffffff8127dc16>] __do_page_fault+0x376/0x960 arch/x86/mm/fault.c:1238
         [<ffffffff8127e358>] trace_do_page_fault+0xe8/0x420 arch/x86/mm/fault.c:1331
         [<ffffffff8126f514>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:264
         [<ffffffff86655578>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:986
      
      Fix it by ensuring that the polling path is holding the host lock
      before entering ata_sff_hsm_move() so that all hardware accesses and
      state updates are performed under the host lock.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-and-tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Link: http://lkml.kernel.org/g/CACT4Y+b_JsOxJu2EZyEf+mOXORc_zid5V1-pLZSroJVxyWdSpw@mail.gmail.comSigned-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      753a93a1