1. 01 Feb, 2021 2 commits
    • Yi Chen's avatar
      f2fs: fix to avoid inconsistent quota data · 25fb04db
      Yi Chen authored
      Occasionally, quota data may be corrupted detected by fsck:
      
      Info: checkpoint state = 45 :  crc compacted_summary unmount
      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1543036928, 762) != expected (1543032832, 762)
      [ASSERT] (fsck_chk_quota_files:1986)  --> Quota file is missing or invalid quota file content found.
      [QUOTA WARNING] Usage inconsistent for ID 0:actual (1352478720, 344) != expected (1352474624, 344)
      [ASSERT] (fsck_chk_quota_files:1986)  --> Quota file is missing or invalid quota file content found.
      
      [FSCK] Unreachable nat entries                        [Ok..] [0x0]
      [FSCK] SIT valid block bitmap checking                [Ok..]
      [FSCK] Hard link checking for regular file            [Ok..] [0x0]
      [FSCK] valid_block_count matching with CP             [Ok..] [0xdf299]
      [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0x2b01]
      [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0x2b01]
      [FSCK] valid_inode_count matched with CP              [Ok..] [0x2665]
      [FSCK] free segment_count matched with CP             [Ok..] [0xcb04]
      [FSCK] next block offset is free                      [Ok..]
      [FSCK] fixing SIT types
      [FSCK] other corrupted bugs                           [Fail]
      
      The root cause is:
      If we open file w/ readonly flag, disk quota info won't be initialized
      for this file, however, following mmap() will force to convert inline
      inode via f2fs_convert_inline_inode(), which may increase block usage
      for this inode w/o updating quota data, it causes inconsistent disk quota
      info.
      
      The issue will happen in following stack:
      open(file, O_RDONLY)
      mmap(file)
      - f2fs_convert_inline_inode
       - f2fs_convert_inline_page
        - f2fs_reserve_block
         - f2fs_reserve_new_block
          - f2fs_reserve_new_blocks
           - f2fs_i_blocks_write
            - dquot_claim_block
      inode->i_blocks increase, but the dqb_curspace keep the size for the dquots
      is NULL.
      
      To fix this issue, let's call dquot_initialize() anyway in both
      f2fs_truncate() and f2fs_convert_inline_inode() functions to avoid potential
      inconsistent quota data issue.
      
      Fixes: 0abd675e ("f2fs: support plain user/group quota")
      Signed-off-by: default avatarDaiyue Zhang <zhangdaiyue1@huawei.com>
      Signed-off-by: default avatarDehe Gu <gudehe@huawei.com>
      Signed-off-by: default avatarJunchao Jiang <jiangjunchao1@huawei.com>
      Signed-off-by: default avatarGe Qiu <qiuge@huawei.com>
      Signed-off-by: default avatarYi Chen <chenyi77@huawei.com>
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      25fb04db
    • Jaegeuk Kim's avatar
      f2fs: flush data when enabling checkpoint back · b0ff4fe7
      Jaegeuk Kim authored
      During checkpoint=disable period, f2fs bypasses all the synchronous IOs such as
      sync and fsync. So, when enabling it back, we must flush all of them in order
      to keep the data persistent. Otherwise, suddern power-cut right after enabling
      checkpoint will cause data loss.
      
      Fixes: 4354994f ("f2fs: checkpoint disabling")
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarChao Yu <yuchao0@huawei.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      b0ff4fe7
  2. 27 Jan, 2021 23 commits
  3. 26 Jan, 2021 7 commits
  4. 25 Jan, 2021 8 commits
    • Paolo Bonzini's avatar
      KVM: x86: allow KVM_REQ_GET_NESTED_STATE_PAGES outside guest mode for VMX · 9a78e158
      Paolo Bonzini authored
      VMX also uses KVM_REQ_GET_NESTED_STATE_PAGES for the Hyper-V eVMCS,
      which may need to be loaded outside guest mode.  Therefore we cannot
      WARN in that case.
      
      However, that part of nested_get_vmcs12_pages is _not_ needed at
      vmentry time.  Split it out of KVM_REQ_GET_NESTED_STATE_PAGES handling,
      so that both vmentry and migration (and in the latter case, independent
      of is_guest_mode) do the parts that are needed.
      
      Cc: <stable@vger.kernel.org> # 5.10.x: f2c7ef3b: KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES
      Cc: <stable@vger.kernel.org> # 5.10.x
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      9a78e158
    • Sean Christopherson's avatar
      KVM: x86: Revert "KVM: x86: Mark GPRs dirty when written" · aed89418
      Sean Christopherson authored
      Revert the dirty/available tracking of GPRs now that KVM copies the GPRs
      to the GHCB on any post-VMGEXIT VMRUN, even if a GPR is not dirty.  Per
      commit de3cd117 ("KVM: x86: Omit caching logic for always-available
      GPRs"), tracking for GPRs noticeably impacts KVM's code footprint.
      
      This reverts commit 1c04d8c9.
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20210122235049.3107620-3-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      aed89418
    • Sean Christopherson's avatar
      KVM: SVM: Unconditionally sync GPRs to GHCB on VMRUN of SEV-ES guest · 25009140
      Sean Christopherson authored
      Drop the per-GPR dirty checks when synchronizing GPRs to the GHCB, the
      GRPs' dirty bits are set from time zero and never cleared, i.e. will
      always be seen as dirty.  The obvious alternative would be to clear
      the dirty bits when appropriate, but removing the dirty checks is
      desirable as it allows reverting GPR dirty+available tracking, which
      adds overhead to all flavors of x86 VMs.
      
      Note, unconditionally writing the GPRs in the GHCB is tacitly allowed
      by the GHCB spec, which allows the hypervisor (or guest) to provide
      unnecessary info; it's the guest's responsibility to consume only what
      it needs (the hypervisor is untrusted after all).
      
        The guest and hypervisor can supply additional state if desired but
        must not rely on that additional state being provided.
      
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Fixes: 291bd20d ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20210122235049.3107620-2-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      25009140
    • Maxim Levitsky's avatar
      KVM: nVMX: Sync unsync'd vmcs02 state to vmcs12 on migration · d51e1d3f
      Maxim Levitsky authored
      Even when we are outside the nested guest, some vmcs02 fields
      may not be in sync vs vmcs12.  This is intentional, even across
      nested VM-exit, because the sync can be delayed until the nested
      hypervisor performs a VMCLEAR or a VMREAD/VMWRITE that affects those
      rarely accessed fields.
      
      However, during KVM_GET_NESTED_STATE, the vmcs12 has to be up to date to
      be able to restore it.  To fix that, call copy_vmcs02_to_vmcs12_rare()
      before the vmcs12 contents are copied to userspace.
      
      Fixes: 7952d769 ("KVM: nVMX: Sync rarely accessed guest fields only when needed")
      Reviewed-by: default avatarSean Christopherson <seanjc@google.com>
      Signed-off-by: default avatarMaxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210114205449.8715-2-mlevitsk@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      d51e1d3f
    • Lorenzo Brescia's avatar
      kvm: tracing: Fix unmatched kvm_entry and kvm_exit events · d95df951
      Lorenzo Brescia authored
      On VMX, if we exit and then re-enter immediately without leaving
      the vmx_vcpu_run() function, the kvm_entry event is not logged.
      That means we will see one (or more) kvm_exit, without its (their)
      corresponding kvm_entry, as shown here:
      
       CPU-1979 [002] 89.871187: kvm_entry: vcpu 1
       CPU-1979 [002] 89.871218: kvm_exit:  reason MSR_WRITE
       CPU-1979 [002] 89.871259: kvm_exit:  reason MSR_WRITE
      
      It also seems possible for a kvm_entry event to be logged, but then
      we leave vmx_vcpu_run() right away (if vmx->emulation_required is
      true). In this case, we will have a spurious kvm_entry event in the
      trace.
      
      Fix these situations by moving trace_kvm_entry() inside vmx_vcpu_run()
      (where trace_kvm_exit() already is).
      
      A trace obtained with this patch applied looks like this:
      
       CPU-14295 [000] 8388.395387: kvm_entry: vcpu 0
       CPU-14295 [000] 8388.395392: kvm_exit:  reason MSR_WRITE
       CPU-14295 [000] 8388.395393: kvm_entry: vcpu 0
       CPU-14295 [000] 8388.395503: kvm_exit:  reason EXTERNAL_INTERRUPT
      
      Of course, not calling trace_kvm_entry() in common x86 code any
      longer means that we need to adjust the SVM side of things too.
      Signed-off-by: default avatarLorenzo Brescia <lorenzo.brescia@edu.unito.it>
      Signed-off-by: default avatarDario Faggioli <dfaggioli@suse.com>
      Message-Id: <160873470698.11652.13483635328769030605.stgit@Wayrath>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      d95df951
    • Zenghui Yu's avatar
      KVM: Documentation: Update description of KVM_{GET,CLEAR}_DIRTY_LOG · 01ead84c
      Zenghui Yu authored
      Update various words, including the wrong parameter name and the vague
      description of the usage of "slot" field.
      Signed-off-by: default avatarZenghui Yu <yuzenghui@huawei.com>
      Message-Id: <20201208043439.895-1-yuzenghui@huawei.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      01ead84c
    • Jay Zhou's avatar
      KVM: x86: get smi pending status correctly · 1f7becf1
      Jay Zhou authored
      The injection process of smi has two steps:
      
          Qemu                        KVM
      Step1:
          cpu->interrupt_request &= \
              ~CPU_INTERRUPT_SMI;
          kvm_vcpu_ioctl(cpu, KVM_SMI)
      
                                      call kvm_vcpu_ioctl_smi() and
                                      kvm_make_request(KVM_REQ_SMI, vcpu);
      
      Step2:
          kvm_vcpu_ioctl(cpu, KVM_RUN, 0)
      
                                      call process_smi() if
                                      kvm_check_request(KVM_REQ_SMI, vcpu) is
                                      true, mark vcpu->arch.smi_pending = true;
      
      The vcpu->arch.smi_pending will be set true in step2, unfortunately if
      vcpu paused between step1 and step2, the kvm_run->immediate_exit will be
      set and vcpu has to exit to Qemu immediately during step2 before mark
      vcpu->arch.smi_pending true.
      During VM migration, Qemu will get the smi pending status from KVM using
      KVM_GET_VCPU_EVENTS ioctl at the downtime, then the smi pending status
      will be lost.
      Signed-off-by: default avatarJay Zhou <jianjay.zhou@huawei.com>
      Signed-off-by: default avatarShengen Zhuang <zhuangshengen@huawei.com>
      Message-Id: <20210118084720.1585-1-jianjay.zhou@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      1f7becf1
    • Like Xu's avatar
      KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in intel_arch_events[] · 98dd2f10
      Like Xu authored
      The HW_REF_CPU_CYCLES event on the fixed counter 2 is pseudo-encoded as
      0x0300 in the intel_perfmon_event_map[]. Correct its usage.
      
      Fixes: 62079d8a ("KVM: PMU: add proper support for fixed counter 2")
      Signed-off-by: default avatarLike Xu <like.xu@linux.intel.com>
      Message-Id: <20201230081916.63417-1-like.xu@linux.intel.com>
      Reviewed-by: default avatarSean Christopherson <seanjc@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      98dd2f10