1. 26 Oct, 2015 40 commits
    • Ludovic Desroches's avatar
      irqchip/atmel-aic5: Use per chip mask caches in mask/unmask() · 325e96f1
      Ludovic Desroches authored
      commit d32dc9aa upstream.
      
      When masking/unmasking interrupts, mask_cache is updated and used later
      for suspend/resume. Unfortunately, it always was the mask_cache
      associated with the first irq chip which was updated. So when performing
      resume, only irqs 0-31 could be enabled.
      
      Fixes: b1479ebb ("irqchip: atmel-aic: Add atmel AIC/AIC5 drivers")
      Signed-off-by: default avatarLudovic Desroches <ludovic.desroches@atmel.com>
      Cc: <sasha.levin@oracle.com>
      Cc: <linux-arm-kernel@lists.infradead.org>
      Cc: <nicolas.ferre@atmel.com>
      Cc: <alexandre.belloni@free-electrons.com>
      Cc: <boris.brezillon@free-electrons.com>
      Cc: <Wenyou.Yang@atmel.com>
      Cc: <jason@lakedaemon.net>
      Cc: <marc.zyngier@arm.com>
      Link: http://lkml.kernel.org/r/1442843173-2390-1-git-send-email-ludovic.desroches@atmel.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      325e96f1
    • Mathias Nyman's avatar
      xhci: init command timeout timer earlier to avoid deleting it uninitialized · 4cd82cf7
      Mathias Nyman authored
      commit cc8e4fc0 upstream.
      
      Don't check if timer is running with a timer_pending() before
      deleting it with del_timer_sync(), this defies the whole point of
      the sync part and can cause a possible race.
      
      Instead we just want to make sure the timer is initialized early enough
      before we have a chance to delete it.
      Reported-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      4cd82cf7
    • Julia Lawall's avatar
      xhci-mem: Use setup_timer · 064a387a
      Julia Lawall authored
      commit 9e08a03d upstream.
      
      Convert a call to init_timer and accompanying intializations of
      the timer's data and function fields to a call to setup_timer.
      
      A simplified version of the semantic match that fixes this problem is as
      follows: (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression t,f,d;
      @@
      
      -init_timer(&t);
      +setup_timer(&t,f,d);
      -t.data = d;
      -t.function = f;
      // </smpl>
      Signed-off-by: default avatarJulia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [ kamal: 3.19-stable prereq for "cc8e4fc0 xhci: init command timeout timer
        earlier to avoid deleting it uninitialized" ]
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      064a387a
    • Mathias Nyman's avatar
      xhci: change xhci 1.0 only restrictions to support xhci 1.1 · 273be7b4
      Mathias Nyman authored
      commit dca77945 upstream.
      
      Some changes between xhci 0.96 and xhci 1.0 specifications forced us to
      check the hci version in code, some of these checks were implemented as
      hci_version == 1.0, which will not work with new xhci 1.1 controllers.
      
      xhci 1.1 behaves similar to xhci 1.0 in these cases, so change these
      checks to hci_version >= 1.0
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      273be7b4
    • Roger Quadros's avatar
      usb: xhci: exit early in xhci_setup_device() if we're halted or dying · b7027671
      Roger Quadros authored
      commit 448116bf upstream.
      
      During quick plug/removal of OTG adapter during dual-role testing
      it can happen that xhci_alloc_device() is called for the newly
      detected device after the DRD library has called xhci_stop to
      remove the HCD.
      
      If that is the case, just fail early to prevent the following warning.
      
      [  154.732649] hub 4-0:1.0: USB hub found
      [  154.742204] hub 4-0:1.0: 1 port detected
      [  154.824458] hub 3-0:1.0: state 7 ports 1 chg 0002 evt 0000
      [  154.854609] hub 4-0:1.0: state 7 ports 1 chg 0000 evt 0000
      [  154.944430] usb 3-1: new high-speed USB device number 2 using xhci-hcd
      [  154.951009] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
      [  155.038191] xhci-hcd xhci-hcd.0.auto: remove, state 4
      [  155.043315] usb usb4: USB disconnect, device number 1
      [  155.055270] xhci-hcd xhci-hcd.0.auto: xhci_stop
      [  155.060094] xhci-hcd xhci-hcd.0.auto: USB bus 4 deregistered
      [  155.066576] xhci-hcd xhci-hcd.0.auto: remove, state 1
      [  155.071710] usb usb3: USB disconnect, device number 1
      [  155.077124] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
      [  155.082389] ------------[ cut here ]------------
      [  155.087690] WARNING: CPU: 0 PID: 72 at drivers/usb/host/xhci.c:3800 xhci_setup_device+0x410/0x484 [xhci_hcd]()
      [  155.097861] Modules linked in: sd_mod usb_storage scsi_mod usb_f_ss_lb g_zero libcomposite ipv6 xhci_plat_hcd xhci_hcd usbcore dwc3 udc_core evdev ti_am335x_adc joydev kfifo_buf industrialio snd_soc_simple_cc
      [  155.146734] CPU: 0 PID: 72 Comm: kworker/0:3 Tainted: G        W       4.1.4-00834-gcd9380b-dirty #50
      [  155.156073] Hardware name: Generic AM43 (Flattened Device Tree)
      [  155.162117] Workqueue: usb_hub_wq hub_event [usbcore]
      [  155.167249] Backtrace:
      [  155.169751] [<c0012af0>] (dump_backtrace) from [<c0012c8c>] (show_stack+0x18/0x1c)
      [  155.177390]  r6:c089d4a4 r5:ffffffff r4:00000000 r3:ee46c000
      [  155.183137] [<c0012c74>] (show_stack) from [<c05f7c14>] (dump_stack+0x84/0xd0)
      [  155.190446] [<c05f7b90>] (dump_stack) from [<c00439ac>] (warn_slowpath_common+0x80/0xbc)
      [  155.198605]  r7:00000009 r6:00000ed8 r5:bf27eb70 r4:00000000
      [  155.204348] [<c004392c>] (warn_slowpath_common) from [<c0043a0c>] (warn_slowpath_null+0x24/0x2c)
      [  155.213202]  r8:ee49f000 r7:ee7c0004 r6:00000000 r5:ee7c0158 r4:ee7c0000
      [  155.220051] [<c00439e8>] (warn_slowpath_null) from [<bf27eb70>] (xhci_setup_device+0x410/0x484 [xhci_hcd])
      [  155.229816] [<bf27e760>] (xhci_setup_device [xhci_hcd]) from [<bf27ec10>] (xhci_address_device+0x14/0x18 [xhci_hcd])
      [  155.240415]  r10:ee598200 r9:00000001 r8:00000002 r7:00000001 r6:00000003 r5:00000002
      [  155.248363]  r4:ee49f000
      [  155.250978] [<bf27ebfc>] (xhci_address_device [xhci_hcd]) from [<bf20cb94>] (hub_port_init+0x1b8/0xa9c [usbcore])
      [  155.261403] [<bf20c9dc>] (hub_port_init [usbcore]) from [<bf2101e0>] (hub_event+0x738/0x1020 [usbcore])
      [  155.270874]  r10:ee598200 r9:ee7c0000 r8:ee7c0038 r7:ee518800 r6:ee49f000 r5:00000001
      [  155.278822]  r4:00000000
      [  155.281426] [<bf20faa8>] (hub_event [usbcore]) from [<c005754c>] (process_one_work+0x128/0x340)
      [  155.290196]  r10:00000000 r9:00000003 r8:00000000 r7:fedfa000 r6:eeec5400 r5:ee598314
      [  155.298151]  r4:ee434380
      [  155.300718] [<c0057424>] (process_one_work) from [<c00578f8>] (worker_thread+0x158/0x49c)
      [  155.308963]  r10:ee434380 r9:00000003 r8:eeec5400 r7:00000008 r6:ee434398 r5:eeec5400
      [  155.316913]  r4:eeec5414
      [  155.319482] [<c00577a0>] (worker_thread) from [<c005cc40>] (kthread+0xdc/0xf8)
      [  155.326765]  r10:00000000 r9:00000000 r8:00000000 r7:c00577a0 r6:ee434380 r5:ee4441c0
      [  155.334713]  r4:00000000 r3:00000000
      [  155.338341] [<c005cb64>] (kthread) from [<c000fc08>] (ret_from_fork+0x14/0x2c)
      [  155.345626]  r7:00000000 r6:00000000 r5:c005cb64 r4:ee4441c0
      [  155.356108] ---[ end trace a58d34c223b190e6 ]---
      [  155.360783] xhci-hcd xhci-hcd.0.auto: Virt dev invalid for slot_id 0x1!
      [  155.574404] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
      [  155.579667] ------------[ cut here ]------------
      Signed-off-by: default avatarRoger Quadros <rogerq@ti.com>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b7027671
    • Roger Quadros's avatar
      usb: xhci: Clear XHCI_STATE_DYING on start · f33c9542
      Roger Quadros authored
      commit e5bfeab0 upstream.
      
      For whatever reason if XHCI died in the previous instant
      then it will never recover on the next xhci_start unless we
      clear the DYING flag.
      Signed-off-by: default avatarRoger Quadros <rogerq@ti.com>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      f33c9542
    • Roger Quadros's avatar
      usb: xhci: lock mutex on xhci_stop · d27dd79d
      Roger Quadros authored
      commit 85ac90f8 upstream.
      
      Else it races with xhci_setup_device
      Signed-off-by: default avatarRoger Quadros <rogerq@ti.com>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      d27dd79d
    • Mathias Nyman's avatar
      xhci: give command abortion one more chance before killing xhci · dcede601
      Mathias Nyman authored
      commit a6809ffd upstream.
      
      We want to give the command abortion an additional try to stop
      the command ring before we completely hose xhci.
      Tested-by: default avatarVincent Pelletier <plr.vincent@gmail.com>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [ luis: backported to 3.16:
        - xhci_handshake() has an extra 'xhci' parameter in 3.16 ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      dcede601
    • Mathias Nyman's avatar
      usb: Use the USB_SS_MULT() macro to get the burst multiplier. · f1b80285
      Mathias Nyman authored
      commit ff30cbc8 upstream.
      
      Bits 1:0 of the bmAttributes are used for the burst multiplier.
      The rest of the bits used to be reserved (zero), but USB3.1 takes bit 7
      into use.
      
      Use the existing USB_SS_MULT() macro instead to make sure the mult value
      and hence max packet calculations are correct for USB3.1 devices.
      
      Note that burst multiplier in bmAttributes is zero based and that
      the USB_SS_MULT() macro adds one.
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      f1b80285
    • Paolo Bonzini's avatar
      KVM: x86: trap AMD MSRs for the TSeg base and mask · 6b003c59
      Paolo Bonzini authored
      commit 3afb1121 upstream.
      
      These have roughly the same purpose as the SMRR, which we do not need
      to implement in KVM.  However, Linux accesses MSR_K8_TSEG_ADDR at
      boot, which causes problems when running a Xen dom0 under KVM.
      Just return 0, meaning that processor protection of SMRAM is not
      in effect.
      Reported-by: default avatarM A Young <m.a.young@durham.ac.uk>
      Acked-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [ luis: backported to 3.16:
        - file rename: arch/x86/include/asm/msr-index.h ->
          arch/x86/include/uapi/asm/msr-index.h
        - adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      6b003c59
    • Dominik Dingel's avatar
      sched: access local runqueue directly in single_task_running · b2a1383e
      Dominik Dingel authored
      commit 00cc1633 upstream.
      
      Commit 2ee507c4 ("sched: Add function single_task_running to let a task
      check if it is the only task running on a cpu") referenced the current
      runqueue with the smp_processor_id.  When CONFIG_DEBUG_PREEMPT is enabled,
      that is only allowed if preemption is disabled or the currrent task is
      bound to the local cpu (e.g. kernel worker).
      
      With commit f7819512 ("kvm: add halt_poll_ns module parameter") KVM
      calls single_task_running. If CONFIG_DEBUG_PREEMPT is enabled that
      generates a lot of kernel messages.
      
      To avoid adding preemption in that cases, as it would limit the usefulness,
      we change single_task_running to access directly the cpu local runqueue.
      
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Fixes: 2ee507c4Signed-off-by: default avatarDominik Dingel <dingel@linux.vnet.ibm.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b2a1383e
    • Luis Henriques's avatar
      zram: fix possible use after free in zcomp_create() · 6cf6073d
      Luis Henriques authored
      commit 3aaf14da upstream.
      
      zcomp_create() verifies the success of zcomp_strm_{multi,single}_create()
      through comp->stream, which can potentially be pointing to memory that
      was freed if these functions returned an error.
      
      While at it, replace a 'ERR_PTR(-ENOMEM)' by a more generic
      'ERR_PTR(error)' as in the future zcomp_strm_{multi,siggle}_create()
      could return other error codes.  Function documentation updated
      accordingly.
      
      Fixes: beca3ec7 ("zram: add multi stream functionality")
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Acked-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: default avatarMinchan Kim <minchan@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      6cf6073d
    • Kyle Evans's avatar
      hp-wmi: limit hotkey enable · 927b7a9c
      Kyle Evans authored
      commit 8a1513b4 upstream.
      
      Do not write initialize magic on systems that do not have
      feature query 0xb. Fixes Bug #82451.
      
      Redefine FEATURE_QUERY to align with 0xb and FEATURE2 with 0xd
      for code clearity.
      
      Add a new test function, hp_wmi_bios_2008_later() & simplify
      hp_wmi_bios_2009_later(), which fixes a bug in cases where
      an improper value is returned. Probably also fixes Bug #69131.
      
      Add missing __init tag.
      Signed-off-by: default avatarKyle Evans <kvans32@gmail.com>
      Signed-off-by: default avatarDarren Hart <dvhart@linux.intel.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      927b7a9c
    • Shawn Lin's avatar
      staging: ion: fix corruption of ion_import_dma_buf · 8012f382
      Shawn Lin authored
      commit 6fa92e2b upstream.
      
      we found this issue but still exit in lastest kernel. Simply
      keep ion_handle_create under mutex_lock to avoid this race.
      
      WARNING: CPU: 2 PID: 2648 at drivers/staging/android/ion/ion.c:512 ion_handle_add+0xb4/0xc0()
      ion_handle_add: buffer already found.
      Modules linked in: iwlmvm iwlwifi mac80211 cfg80211 compat
      CPU: 2 PID: 2648 Comm: TimedEventQueue Tainted: G        W    3.14.0 #7
       00000000 00000000 9a3efd2c 80faf273 9a3efd6c 9a3efd5c 80935dc9 811d7fd3
       9a3efd88 00000a58 812208a0 00000200 80e128d4 80e128d4 8d4ae00c a8cd8600
       a8cd8094 9a3efd74 80935e0e 00000009 9a3efd6c 811d7fd3 9a3efd88 9a3efd9c
      Call Trace:
        [<80faf273>] dump_stack+0x48/0x69
        [<80935dc9>] warn_slowpath_common+0x79/0x90
        [<80e128d4>] ? ion_handle_add+0xb4/0xc0
        [<80e128d4>] ? ion_handle_add+0xb4/0xc0
        [<80935e0e>] warn_slowpath_fmt+0x2e/0x30
        [<80e128d4>] ion_handle_add+0xb4/0xc0
        [<80e144cc>] ion_import_dma_buf+0x8c/0x110
        [<80c517c4>] reg_init+0x364/0x7d0
        [<80993363>] ? futex_wait+0x123/0x210
        [<80992e0e>] ? get_futex_key+0x16e/0x1e0
        [<8099308f>] ? futex_wake+0x5f/0x120
        [<80c51e19>] vpu_service_ioctl+0x1e9/0x500
        [<80994aec>] ? do_futex+0xec/0x8e0
        [<80971080>] ? prepare_to_wait_event+0xc0/0xc0
        [<80c51c30>] ? reg_init+0x7d0/0x7d0
        [<80a22562>] do_vfs_ioctl+0x2d2/0x4c0
        [<80b198ad>] ? inode_has_perm.isra.41+0x2d/0x40
        [<80b199cf>] ? file_has_perm+0x7f/0x90
        [<80b1a5f7>] ? selinux_file_ioctl+0x47/0xf0
        [<80a227a8>] SyS_ioctl+0x58/0x80
        [<80fb45e8>] syscall_call+0x7/0x7
        [<80fb0000>] ? mmc_do_calc_max_discard+0xab/0xe4
      
      Fixes: 83271f62 ("ion: hold reference to handle...")
      Signed-off-by: default avatarShawn Lin <shawn.lin@rock-chips.com>
      Reviewed-by: default avatarLaura Abbott <labbott@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8012f382
    • Marc Zyngier's avatar
      arm: KVM: Disable virtual timer even if the guest is not using it · 72eee9a7
      Marc Zyngier authored
      commit 688bc577 upstream.
      
      When running a guest with the architected timer disabled (with QEMU and
      the kernel_irqchip=off option, for example), it is important to make
      sure the timer gets turned off. Otherwise, the guest may try to
      enable it anyway, leading to a screaming HW interrupt.
      
      The fix is to unconditionally turn off the virtual timer on guest
      exit.
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      72eee9a7
    • Marc Zyngier's avatar
      arm64: KVM: Disable virtual timer even if the guest is not using it · a585caff
      Marc Zyngier authored
      commit c4cbba9f upstream.
      
      When running a guest with the architected timer disabled (with QEMU and
      the kernel_irqchip=off option, for example), it is important to make
      sure the timer gets turned off. Otherwise, the guest may try to
      enable it anyway, leading to a screaming HW interrupt.
      
      The fix is to unconditionally turn off the virtual timer on guest
      exit.
      Reviewed-by: default avatarChristoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      a585caff
    • Martin Schwidefsky's avatar
      s390/compat: correct uc_sigmask of the compat signal frame · 09ed13de
      Martin Schwidefsky authored
      commit 8d4bd0ed upstream.
      
      The uc_sigmask in the ucontext structure is an array of words to keep
      the 64 signal bits (or 1024 if you ask glibc but the kernel sigset_t
      only has 64 bits).
      
      For 64 bit the sigset_t contains a single 8 byte word, but for 31 bit
      there are two 4 byte words. The compat signal handler code uses a
      simple copy of the 64 bit sigset_t to the 31 bit compat_sigset_t.
      As s390 is a big-endian architecture this is incorrect, the two words
      in the 31 bit sigset_t array need to be swapped.
      Reported-by: default avatarStefan Liebler <stli@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      09ed13de
    • Will Deacon's avatar
      arm64: errata: add module build workaround for erratum #843419 · 8235e04e
      Will Deacon authored
      commit df057cc7 upstream.
      
      Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can
      lead to a memory access using an incorrect address in certain sequences
      headed by an ADRP instruction.
      
      There is a linker fix to generate veneers for ADRP instructions, but
      this doesn't work for kernel modules which are built as unlinked ELF
      objects.
      
      This patch adds a new config option for the erratum which, when enabled,
      builds kernel modules with the mcmodel=large flag. This uses absolute
      addressing for all kernel symbols, thereby removing the use of ADRP as
      a PC-relative form of addressing. The ADRP relocs are removed from the
      module loader so that we fail to load any potentially affected modules.
      Acked-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      8235e04e
    • Will Deacon's avatar
      arm64: compat: fix vfp save/restore across signal handlers in big-endian · 67684b80
      Will Deacon authored
      commit bdec97a8 upstream.
      
      When saving/restoring the VFP registers from a compat (AArch32)
      signal frame, we rely on the compat registers forming a prefix of the
      native register file and therefore make use of copy_{to,from}_user to
      transfer between the native fpsimd_state and the compat_vfp_sigframe.
      
      Unfortunately, this doesn't work so well in a big-endian environment.
      Our fpsimd save/restore code operates directly on 128-bit quantities
      (Q registers) whereas the compat_vfp_sigframe represents the registers
      as an array of 64-bit (D) registers. The architecture packs the compat D
      registers into the Q registers, with the least significant bytes holding
      the lower register. Consequently, we need to swap the 64-bit halves when
      converting between these two representations on a big-endian machine.
      
      This patch replaces the __copy_{to,from}_user invocations in our
      compat VFP signal handling code with explicit __put_user loops that
      operate on 64-bit values and swap them accordingly.
      Reviewed-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      67684b80
    • Pavel Fedin's avatar
      arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources · 669c12b9
      Pavel Fedin authored
      commit c2f58514 upstream.
      
      Until b26e5fda ("arm/arm64: KVM: introduce per-VM ops"),
      kvm_vgic_map_resources() used to include a check on irqchip_in_kernel(),
      and vgic_v2_map_resources() still has it.
      
      But now vm_ops are not initialized until we call kvm_vgic_create().
      Therefore kvm_vgic_map_resources() can being called without a VGIC,
      and we die because vm_ops.map_resources is NULL.
      
      Fixing this restores QEMU's kernel-irqchip=off option to a working state,
      allowing to use GIC emulation in userspace.
      
      Fixes: b26e5fda ("arm/arm64: KVM: introduce per-VM ops")
      Signed-off-by: default avatarPavel Fedin <p.fedin@samsung.com>
      [maz: reworked commit message]
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      669c12b9
    • David Woodhouse's avatar
      x86/platform: Fix Geode LX timekeeping in the generic x86 build · 7e9a28a8
      David Woodhouse authored
      commit 03da3ff1 upstream.
      
      In 2007, commit 07190a08 ("Mark TSC on GeodeLX reliable")
      bypassed verification of the TSC on Geode LX. However, this code
      (now in the check_system_tsc_reliable() function in
      arch/x86/kernel/tsc.c) was only present if CONFIG_MGEODE_LX was
      set.
      
      OpenWRT has recently started building its generic Geode target
      for Geode GX, not LX, to include support for additional
      platforms. This broke the timekeeping on LX-based devices,
      because the TSC wasn't marked as reliable:
      https://dev.openwrt.org/ticket/20531
      
      By adding a runtime check on is_geode_lx(), we can also include
      the fix if CONFIG_MGEODEGX1 or CONFIG_X86_GENERIC are set, thus
      fixing the problem.
      Signed-off-by: default avatarDavid Woodhouse <David.Woodhouse@intel.com>
      Cc: Andres Salomon <dilinger@queued.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Marcelo Tosatti <marcelo@kvack.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1442409003.131189.87.camel@infradead.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      7e9a28a8
    • Marek Majtyka's avatar
      arm: KVM: Fix incorrect device to IPA mapping · 618a9728
      Marek Majtyka authored
      commit ca09f02f upstream.
      
      A critical bug has been found in device memory stage1 translation for
      VMs with more then 4GB of address space. Once vm_pgoff size is smaller
      then pa (which is true for LPAE case, u32 and u64 respectively) some
      more significant bits of pa may be lost as a shift operation is performed
      on u32 and later cast onto u64.
      
      Example: vm_pgoff(u32)=0x00210030, PAGE_SHIFT=12
              expected pa(u64):   0x0000002010030000
              produced pa(u64):   0x0000000010030000
      
      The fix is to change the order of operations (casting first onto phys_addr_t
      and then shifting).
      Reviewed-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      [maz: fixed changelog and patch formatting]
      Signed-off-by: default avatarMarek Majtyka <marek.majtyka@tieto.com>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      618a9728
    • Wanpeng Li's avatar
      KVM: vmx: fix VPID is 0000H in non-root operation · 7fb7dfcb
      Wanpeng Li authored
      commit 04bb92e4 upstream.
      
      Reference SDM 28.1:
      
      The current VPID is 0000H in the following situations:
      - Outside VMX operation. (This includes operation in system-management
        mode under the default treatment of SMIs and SMM with VMX operation;
        see Section 34.14.)
      - In VMX root operation.
      - In VMX non-root operation when the “enable VPID” VM-execution control
        is 0.
      
      The VPID should never be 0000H in non-root operation when "enable VPID"
      VM-execution control is 1. However, commit 34a1cd60 ("kvm: x86: vmx:
      move some vmx setting from vmx_init() to hardware_setup()") remove the
      codes which reserve 0000H for VMX root operation.
      
      This patch fix it by again reserving 0000H for VMX root operation.
      
      Fixes: 34a1cd60Reported-by: default avatarWincy Van <fanwenyi0529@gmail.com>
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      7fb7dfcb
    • Aneesh Kumar K.V's avatar
      powerpc/mm: Recompute hash value after a failed update · aaef98f9
      Aneesh Kumar K.V authored
      commit 36b35d5d upstream.
      
      If we had secondary hash flag set, we ended up modifying hash value in
      the updatepp code path. Hence with a failed updatepp we will be using
      a wrong hash value for the following hash insert. Fix this by
      recomputing hash before insert.
      
      Without this patch we can end up with using wrong slot number in linux
      pte. That can result in us missing an hash pte update or invalidate
      which can cause memory corruption or even machine check.
      
      Fixes: 6d492ecc ("powerpc/THP: Add code to handle HPTE faults for hugepages")
      Signed-off-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: default avatarPaul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      aaef98f9
    • Benjamin Herrenschmidt's avatar
      powerpc/boot: Specify ABI v2 when building an LE boot wrapper · 885c569d
      Benjamin Herrenschmidt authored
      commit 655471f5 upstream.
      
      The kernel does it, not the boot wrapper, which breaks with some
      cross compilers that still default to ABI v1.
      
      Fixes: 147c0516 ("powerpc/boot: Add support for 64bit little endian wrapper")
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      885c569d
    • Russell King's avatar
      ARM: fix Thumb2 signal handling when ARMv6 is enabled · 2a823291
      Russell King authored
      commit 9b55613f upstream.
      
      When a kernel is built covering ARMv6 to ARMv7, we omit to clear the
      IT state when entering a signal handler.  This can cause the first
      few instructions to be conditionally executed depending on the parent
      context.
      
      In any case, the original test for >= ARMv7 is broken - ARMv6 can have
      Thumb-2 support as well, and an ARMv6T2 specific build would omit this
      code too.
      
      Relax the test back to ARMv6 or greater.  This results in us always
      clearing the IT state bits in the PSR, even on CPUs where these bits
      are reserved.  However, they're reserved for the IT state, so this
      should cause no harm.
      
      Fixes: d71e1352 ("Clear the IT state when invoking a Thumb-2 signal handler")
      Acked-by: default avatarTony Lindgren <tony@atomide.com>
      Tested-by: default avatarH. Nikolaus Schaller <hns@goldelico.com>
      Tested-by: default avatarGrazvydas Ignotas <notasas@gmail.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      2a823291
    • Jenny Derzhavetz's avatar
      iser-target: Put the reference on commands waiting for unsol data · 3080ce45
      Jenny Derzhavetz authored
      commit 3e03c4b0 upstream.
      
      The iscsi target core teardown sequence calls wait_conn for
      all active commands to finish gracefully by:
      - move the queue-pair to error state
      - drain all the completions
      - wait for the core to finish handling all session commands
      
      However, when tearing down a session while there are sequenced
      commands that are still waiting for unsolicited data outs, we can
      block forever as these are missing an extra reference put.
      
      We basically need the equivalent of iscsit_free_queue_reqs_for_conn()
      which is called after wait_conn has returned. Address this by an
      explicit walk on conn_cmd_list and put the extra reference.
      Signed-off-by: default avatarJenny Derzhavetz <jennyf@mellanox.com>
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      3080ce45
    • Jenny Derzhavetz's avatar
      iser-target: remove command with state ISTATE_REMOVE · 6f2c24fe
      Jenny Derzhavetz authored
      commit a4c15cd9 upstream.
      
      As documented in iscsit_sequence_cmd:
      /*
       * Existing callers for iscsit_sequence_cmd() will silently
       * ignore commands with CMDSN_LOWER_THAN_EXP, so force this
       * return for CMDSN_MAXCMDSN_OVERRUN as well..
       */
      
      We need to silently finish a command when it's in ISTATE_REMOVE.
      This fixes an teardown hang we were seeing where a mis-behaved
      initiator (triggered by allocation error injections) sent us a
      cmdsn which was lower than expected.
      Signed-off-by: default avatarJenny Derzhavetz <jennyf@mellanox.com>
      Signed-off-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      6f2c24fe
    • Simon Guinot's avatar
      net: mvneta: fix DMA buffer unmapping in mvneta_rx() · 1f544f65
      Simon Guinot authored
      commit daf158d0 upstream.
      
      This patch fixes a regression introduced by the commit a84e3289
      ("net: mvneta: fix refilling for Rx DMA buffers"). Due to this commit
      the newly allocated Rx buffers are DMA-unmapped in place of those passed
      to the networking stack. Obviously, this causes data corruptions.
      
      This patch fixes the issue by ensuring that the right Rx buffers are
      DMA-unmapped.
      Reported-by: default avatarOren Laskin <oren@igneous.io>
      Signed-off-by: default avatarSimon Guinot <simon.guinot@sequanux.org>
      Fixes: a84e3289 ("net: mvneta: fix refilling for Rx DMA buffers")
      Tested-by: default avatarOren Laskin <oren@igneous.io>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      1f544f65
    • Jason Wang's avatar
      kvm: fix zero length mmio searching · 905db2ff
      Jason Wang authored
      commit 8f4216c7 upstream.
      
      Currently, if we had a zero length mmio eventfd assigned on
      KVM_MMIO_BUS. It will never be found by kvm_io_bus_cmp() since it
      always compares the kvm_io_range() with the length that guest
      wrote. This will cause e.g for vhost, kick will be trapped by qemu
      userspace instead of vhost. Fixing this by using zero length if an
      iodevice is zero length.
      
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Reviewed-by: default avatarCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      905db2ff
    • Jason Wang's avatar
      kvm: fix double free for fast mmio eventfd · c0c29106
      Jason Wang authored
      commit eefd6b06 upstream.
      
      We register wildcard mmio eventfd on two buses, once for KVM_MMIO_BUS
      and once on KVM_FAST_MMIO_BUS but with a single iodev
      instance. This will lead to an issue: kvm_io_bus_destroy() knows
      nothing about the devices on two buses pointing to a single dev. Which
      will lead to double free[1] during exit. Fix this by allocating two
      instances of iodevs then registering one on KVM_MMIO_BUS and another
      on KVM_FAST_MMIO_BUS.
      
      CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
      Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
      task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
      RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
      RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
      RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
      RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
      RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
      R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
      R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
      FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
      Stack:
      ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
      ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
       0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
      Call Trace:
      [<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
      [<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
      [<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
      [<ffffffff811f69f7>] __fput+0xe7/0x250
      [<ffffffff811f6bae>] ____fput+0xe/0x10
      [<ffffffff81093f04>] task_work_run+0xd4/0xf0
      [<ffffffff81079358>] do_exit+0x368/0xa50
      [<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
      [<ffffffff81079ad5>] do_group_exit+0x45/0xb0
      [<ffffffff81085c71>] get_signal+0x291/0x750
      [<ffffffff810144d8>] do_signal+0x28/0xab0
      [<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
      [<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
      [<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
      [<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
      [<ffffffff817cb9af>] int_signal+0x12/0x17
      Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
       RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
       RSP <ffff88020e7f3bc8>
      
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Reviewed-by: default avatarCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      c0c29106
    • Jason Wang's avatar
      kvm: factor out core eventfd assign/deassign logic · b86d4a9e
      Jason Wang authored
      commit 85da11ca upstream.
      
      This patch factors out core eventfd assign/deassign logic and leaves
      the argument checking and bus index selection to callers.
      
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Reviewed-by: default avatarCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      b86d4a9e
    • Jason Wang's avatar
      kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd · c4e74f99
      Jason Wang authored
      commit 8453fecb upstream.
      
      We only want zero length mmio eventfd to be registered on
      KVM_FAST_MMIO_BUS. So check this explicitly when arg->len is zero to
      make sure this.
      
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Reviewed-by: default avatarCornelia Huck <cornelia.huck@de.ibm.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      c4e74f99
    • Will Deacon's avatar
      arm64: head.S: initialise mdcr_el2 in el2_setup · 0d658d98
      Will Deacon authored
      commit d10bcd47 upstream.
      
      When entering the kernel at EL2, we fail to initialise the MDCR_EL2
      register which controls debug access and PMU capabilities at EL1.
      
      This patch ensures that the register is initialised so that all traps
      are disabled and all the PMU counters are available to the host. When a
      guest is scheduled, KVM takes care to configure trapping appropriately.
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      0d658d98
    • Daniel Axtens's avatar
      cxl: Fix unbalanced pci_dev_get in cxl_probe · 62414d49
      Daniel Axtens authored
      commit 2925c2fd upstream.
      
      Currently the first thing we do in cxl_probe is to grab a reference
      on the pci device. Later on, we call device_register on our adapter.
      In our remove path, we call device_unregister, but we never call
      pci_dev_put. We therefore leak the device every time we do a
      reflash.
      
      device_register/unregister is sufficient to hold the reference.
      Therefore, drop the call to pci_dev_get.
      
      Here's why this is safe.
      The proposed cxl_probe(pdev) calls cxl_adapter_init:
          a) init calls cxl_adapter_alloc, which creates a struct cxl,
             conventionally called adapter. This struct contains a
             device entry, adapter->dev.
      
          b) init calls cxl_configure_adapter, where we set
             adapter->dev.parent = &dev->dev (here dev is the pci dev)
      
      So at this point, the cxl adapter's device's parent is the PCI
      device that I want to be refcounted properly.
      
          c) init calls cxl_register_adapter
             *) cxl_register_adapter calls device_register(&adapter->dev)
      
      So now we're in device_register, where dev is the adapter device, and
      we want to know if the PCI device is safe after we return.
      
      device_register(&adapter->dev) calls device_initialize() and then
      device_add().
      
      device_add() does a get_device(). device_add() also explicitly grabs
      the device's parent, and calls get_device() on it:
      
               parent = get_device(dev->parent);
      
      So therefore, device_register() takes a lock on the parent PCI dev,
      which is what pci_dev_get() was guarding. pci_dev_get() can therefore
      be safely removed.
      
      Fixes: f204e0b8 ("cxl: Driver code for powernv PCIe based cards for userspace access")
      Signed-off-by: default avatarDaniel Axtens <dja@axtens.net>
      Acked-by: default avatarIan Munsie <imunsie@au1.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      62414d49
    • Peter Chen's avatar
      usb: chipidea: udc: using the correct stall implementation · 7c1a9e59
      Peter Chen authored
      commit 56ffa1d1 upstream.
      
      According to spec, there are functional and protocol stalls.
      
      For functional stall, it is for bulk and interrupt endpoints,
      below are cases for it:
      - Host sends SET_FEATURE request for Set-Halt, the udc driver
      needs to set stall, and return true unconditionally.
      - The gadget driver may call usb_ep_set_halt to stall certain
      endpoints, if there is a transfer in pending, the udc driver
      should not set stall, and return -EAGAIN accordingly.
      These two kinds of stall need to be cleared by host using CLEAR_FEATURE
      request (Clear-Halt).
      
      For protocol stall, it is for control endpoint, this stall will
      be set if the control request has failed. This stall will be
      cleared by next setup request (hardware will do it).
      
      It fixed usbtest (drivers/usb/misc/usbtest.c) Test 13 "set/clear halt"
      test failure, meanwhile, this change has been verified by
      USB2 CV Compliance Test and MSC Tests.
      
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Felipe Balbi <balbi@ti.com>
      Signed-off-by: default avatarPeter Chen <peter.chen@freescale.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      7c1a9e59
    • Jeff Mahoney's avatar
      btrfs: skip waiting on ordered range for special files · a6d10b34
      Jeff Mahoney authored
      commit a30e577c upstream.
      
      In btrfs_evict_inode, we properly truncate the page cache for evicted
      inodes but then we call btrfs_wait_ordered_range for every inode as well.
      It's the right thing to do for regular files but results in incorrect
      behavior for device inodes for block devices.
      
      filemap_fdatawrite_range gets called with inode->i_mapping which gets
      resolved to the block device inode before getting passed to
      wbc_attach_fdatawrite_inode and ultimately to inode_to_bdi.  What happens
      next depends on whether there's an open file handle associated with the
      inode.  If there is, we write to the block device, which is unexpected
      behavior.  If there isn't, we through normally and inode->i_data is used.
      We can also end up racing against open/close which can result in crashes
      when i_mapping points to a block device inode that has been closed.
      
      Since there can't be any page cache associated with special file inodes,
      it's safe to skip the btrfs_wait_ordered_range call entirely and avoid
      the problem.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=100911Tested-by: default avatarChristoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      a6d10b34
    • Filipe Manana's avatar
      Btrfs: fix read corruption of compressed and shared extents · 46e971cd
      Filipe Manana authored
      commit 005efedf upstream.
      
      If a file has a range pointing to a compressed extent, followed by
      another range that points to the same compressed extent and a read
      operation attempts to read both ranges (either completely or part of
      them), the pages that correspond to the second range are incorrectly
      filled with zeroes.
      
      Consider the following example:
      
        File layout
        [0 - 8K]                      [8K - 24K]
            |                             |
            |                             |
         points to extent X,         points to extent X,
         offset 4K, length of 8K     offset 0, length 16K
      
        [extent X, compressed length = 4K uncompressed length = 16K]
      
      If a readpages() call spans the 2 ranges, a single bio to read the extent
      is submitted - extent_io.c:submit_extent_page() would only create a new
      bio to cover the second range pointing to the extent if the extent it
      points to had a different logical address than the extent associated with
      the first range. This has a consequence of the compressed read end io
      handler (compression.c:end_compressed_bio_read()) finish once the extent
      is decompressed into the pages covering the first range, leaving the
      remaining pages (belonging to the second range) filled with zeroes (done
      by compression.c:btrfs_clear_biovec_end()).
      
      So fix this by submitting the current bio whenever we find a range
      pointing to a compressed extent that was preceded by a range with a
      different extent map. This is the simplest solution for this corner
      case. Making the end io callback populate both ranges (or more, if we
      have multiple pointing to the same extent) is a much more complex
      solution since each bio is tightly coupled with a single extent map and
      the extent maps associated to the ranges pointing to the shared extent
      can have different offsets and lengths.
      
      The following test case for fstests triggers the issue:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _need_to_be_root
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _require_cloner
      
        rm -f $seqres.full
      
        test_clone_and_read_compressed_extent()
        {
            local mount_opts=$1
      
            _scratch_mkfs >>$seqres.full 2>&1
            _scratch_mount $mount_opts
      
            # Create a test file with a single extent that is compressed (the
            # data we write into it is highly compressible no matter which
            # compression algorithm is used, zlib or lzo).
            $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 4K"        \
                            -c "pwrite -S 0xbb 4K 8K"        \
                            -c "pwrite -S 0xcc 12K 4K"       \
                            $SCRATCH_MNT/foo | _filter_xfs_io
      
            # Now clone our extent into an adjacent offset.
            $CLONER_PROG -s $((4 * 1024)) -d $((16 * 1024)) -l $((8 * 1024)) \
                $SCRATCH_MNT/foo $SCRATCH_MNT/foo
      
            # Same as before but for this file we clone the extent into a lower
            # file offset.
            $XFS_IO_PROG -f -c "pwrite -S 0xaa 8K 4K"         \
                            -c "pwrite -S 0xbb 12K 8K"        \
                            -c "pwrite -S 0xcc 20K 4K"        \
                            $SCRATCH_MNT/bar | _filter_xfs_io
      
            $CLONER_PROG -s $((12 * 1024)) -d 0 -l $((8 * 1024)) \
                $SCRATCH_MNT/bar $SCRATCH_MNT/bar
      
            echo "File digests before unmounting filesystem:"
            md5sum $SCRATCH_MNT/foo | _filter_scratch
            md5sum $SCRATCH_MNT/bar | _filter_scratch
      
            # Evicting the inode or clearing the page cache before reading
            # again the file would also trigger the bug - reads were returning
            # all bytes in the range corresponding to the second reference to
            # the extent with a value of 0, but the correct data was persisted
            # (it was a bug exclusively in the read path). The issue happened
            # only if the same readpages() call targeted pages belonging to the
            # first and second ranges that point to the same compressed extent.
            _scratch_remount
      
            echo "File digests after mounting filesystem again:"
            # Must match the same digests we got before.
            md5sum $SCRATCH_MNT/foo | _filter_scratch
            md5sum $SCRATCH_MNT/bar | _filter_scratch
        }
      
        echo -e "\nTesting with zlib compression..."
        test_clone_and_read_compressed_extent "-o compress=zlib"
      
        _scratch_unmount
      
        echo -e "\nTesting with lzo compression..."
        test_clone_and_read_compressed_extent "-o compress=lzo"
      
        status=0
        exit
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: Qu Wenruo<quwenruo@cn.fujitsu.com>
      Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      46e971cd
    • Carl Frederik Werner's avatar
      ARM: dts: omap3-beagle: make i2c3, ddc and tfp410 gpio work again · 3571c540
      Carl Frederik Werner authored
      commit 3a2fa775 upstream.
      
      Let's fix pinmux address of gpio 170 used by tfp410 powerdown-gpio.
      
      According to the OMAP35x Technical Reference Manual
        CONTROL_PADCONF_I2C3_SDA[15:0]  0x480021C4 mode0: i2c3_sda
        CONTROL_PADCONF_I2C3_SDA[31:16] 0x480021C4 mode4: gpio_170
      the pinmux address of gpio 170 must be 0x480021C6.
      
      The former wrong address broke i2c3 (used by hdmi ddc), resulting in
      kernel message:
        omap_i2c 48060000.i2c: controller timed out
      
      Fixes: 8cecf52b ("ARM: omap3-beagle.dts: add display information")
      Signed-off-by: default avatarCarl Frederik Werner <frederik@cfbw.eu>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      3571c540
    • Shaohua Li's avatar
      x86/apic: Serialize LVTT and TSC_DEADLINE writes · ac366408
      Shaohua Li authored
      commit 5d7c631d upstream.
      
      The APIC LVTT register is MMIO mapped but the TSC_DEADLINE register is an
      MSR. The write to the TSC_DEADLINE MSR is not serializing, so it's not
      guaranteed that the write to LVTT has reached the APIC before the
      TSC_DEADLINE MSR is written. In such a case the write to the MSR is
      ignored and as a consequence the local timer interrupt never fires.
      
      The SDM decribes this issue for xAPIC and x2APIC modes. The
      serialization methods recommended by the SDM differ.
      
      xAPIC:
       "1. Memory-mapped write to LVT Timer Register, setting bits 18:17 to 10b.
        2. WRMSR to the IA32_TSC_DEADLINE MSR a value much larger than current time-stamp counter.
        3. If RDMSR of the IA32_TSC_DEADLINE MSR returns zero, go to step 2.
        4. WRMSR to the IA32_TSC_DEADLINE MSR the desired deadline."
      
      x2APIC:
       "To allow for efficient access to the APIC registers in x2APIC mode,
        the serializing semantics of WRMSR are relaxed when writing to the
        APIC registers. Thus, system software should not use 'WRMSR to APIC
        registers in x2APIC mode' as a serializing instruction. Read and write
        accesses to the APIC registers will occur in program order. A WRMSR to
        an APIC register may complete before all preceding stores are globally
        visible; software can prevent this by inserting a serializing
        instruction, an SFENCE, or an MFENCE before the WRMSR."
      
      The xAPIC method is to just wait for the memory mapped write to hit
      the LVTT by checking whether the MSR write has reached the hardware.
      There is no reason why a proper MFENCE after the memory mapped write would
      not do the same. Andi Kleen confirmed that MFENCE is sufficient for the
      xAPIC case as well.
      
      Issue MFENCE before writing to the TSC_DEADLINE MSR. This can be done
      unconditionally as all CPUs which have TSC_DEADLINE also have MFENCE
      support.
      
      [ tglx: Massaged the changelog ]
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Reviewed-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: <Kernel-team@fb.com>
      Cc: <lenb@kernel.org>
      Cc: <fenghua.yu@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Link: http://lkml.kernel.org/r/20150909041352.GA2059853@devbig257.prn2.facebook.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarKamal Mostafa <kamal@canonical.com>
      ac366408