1. 11 Feb, 2021 3 commits
    • block/keyslot-manager: Introduce functions for device mapper support · d3b17a24
      Satya Tangirala authored
      Introduce blk_ksm_update_capabilities() to update the capabilities of
      a keyslot manager (ksm) in-place. The pointer to a ksm in a device's
      request queue may not be easily replaced, because upper layers like
      the filesystem might access it (e.g. for programming keys/checking
      capabilities) at the same time the device wants to replace that
      request queue's ksm (and free the old ksm's memory). This function
      allows the device to update the capabilities of the ksm in its request
      queue directly. Devices can safely update the ksm this way without any
      synchronization with upper layers *only* if the updated (new) ksm
      continues to support all the crypto capabilities that the old ksm did
      (see description below for blk_ksm_is_superset() for why this is so).
      
      Also introduce blk_ksm_is_superset() which checks whether one ksm's
      capabilities are a (not necessarily strict) superset of another ksm's.
      The blk-crypto framework requires that crypto capabilities that were
      advertised when a bio was created continue to be supported by the
      device until that bio is ended - in practice this probably means that
      a device's advertised crypto capabilities can *never* "shrink" (since
      there's no synchronization between bio creation and when a device may
      want to change its advertised capabilities) - so a previously
      advertised crypto capability must always continue to be supported.
      This function can be used to check that a new ksm is a valid
      replacement for an old ksm.
      Signed-off-by: Satya Tangirala <satyat@google.com>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Acked-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
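The superset rule described in this commit can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct ksm_caps`, `MODEL_NUM_MODES`, `caps_is_superset()` and `caps_update_in_place()` are hypothetical names invented here, and the layout (one bitmask of supported data unit sizes per crypto mode, plus a maximum data unit number size) is a simplification, not the kernel's structure. Folding the superset check into the update helper is also a choice made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NUM_MODES 4  /* hypothetical number of crypto modes */

/* Simplified stand-in for the capabilities a keyslot manager advertises. */
struct ksm_caps {
    /* Per mode: bitmask of supported data unit sizes (0 = unsupported). */
    unsigned int modes_supported[MODEL_NUM_MODES];
    /* Largest data unit number size, in bytes, that is accepted. */
    unsigned int max_dun_bytes;
};

/* True if `superset` supports everything that `subset` supports. */
bool caps_is_superset(const struct ksm_caps *superset,
                      const struct ksm_caps *subset)
{
    for (size_t i = 0; i < MODEL_NUM_MODES; i++) {
        /* A bit set in subset but clear in superset is a lost capability. */
        if (subset->modes_supported[i] & ~superset->modes_supported[i])
            return false;
    }
    return superset->max_dun_bytes >= subset->max_dun_bytes;
}

/* Overwrite `target` in place, but only if no capability would shrink. */
bool caps_update_in_place(struct ksm_caps *target,
                          const struct ksm_caps *updated)
{
    if (!caps_is_superset(updated, target))
        return false;
    *target = *updated;
    return true;
}
```

Because the check is not strict, an update with identical capabilities is accepted, while any update that drops a single data unit size for a single mode is rejected.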
    • block/keyslot-manager: Introduce passthrough keyslot manager · 7bdcc48f
      Satya Tangirala authored
      The device mapper may map over devices that have inline encryption
      capabilities, and to make use of those capabilities, the DM device must
      itself advertise those inline encryption capabilities. One way to do this
      would be to have the DM device set up a keyslot manager with a
      "sufficiently large" number of keyslots, but that would use a lot of
      memory. Also, the DM device itself has no "keyslots", and it doesn't make
      much sense to talk about "programming a key into a DM device's keyslot
      manager", so all that extra memory used to represent those keyslots is just
      wasted. All a DM device really needs to be able to do is advertise the
      crypto capabilities of the underlying devices in a coherent manner and
      expose a way to evict keys from the underlying devices.
      
      There are also devices with inline encryption hardware that do not
      have a limited number of keyslots. One can send a raw encryption key along
      with a bio to these devices (as opposed to typical inline encryption
      hardware that requires users to first program a raw encryption key into a
      keyslot, and send the index of that keyslot along with the bio). These
      devices also only need the same things from the keyslot manager that DM
      devices need - a way to advertise crypto capabilities and potentially a way
      to expose a function to evict keys from hardware.
      
      So we introduce a "passthrough" keyslot manager that provides a way to
      represent a keyslot manager that doesn't have a limited number of
      keyslots and that does not require keys to be programmed into keyslots.
      DM devices can set up a passthrough keyslot manager in their request
      queues, and advertise appropriate crypto capabilities based on those of the
      underlying devices. Blk-crypto does not attempt to program keys into any
      keyslots in the passthrough keyslot manager. Instead, if/when the bio is
      resubmitted to the underlying device, blk-crypto will try to program the
      key into the underlying device's keyslot manager.
      Signed-off-by: Satya Tangirala <satyat@google.com>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Acked-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
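The difference between a slot-based manager and a passthrough one can be modelled in a few lines. This is a minimal sketch, assuming that "no keyslots" is represented as a slot count of zero; every name here (`struct model_ksm`, `model_ksm_get_slot()`, `MODEL_SLOT_NONE`, `MODEL_SLOT_BUSY`) is hypothetical, keys are reduced to non-negative integer ids, and eviction is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_SLOTS 8     /* hypothetical upper bound for this model */
#define MODEL_SLOT_NONE (-1)  /* passthrough: no slot is ever used */
#define MODEL_SLOT_BUSY (-2)  /* slot-based manager with every slot taken */

struct model_ksm {
    unsigned int num_slots;         /* 0 marks a passthrough manager */
    int slot_key[MODEL_MAX_SLOTS];  /* key id held by each slot, -1 = empty */
};

void model_ksm_init(struct model_ksm *ksm, unsigned int num_slots)
{
    if (num_slots > MODEL_MAX_SLOTS)
        num_slots = MODEL_MAX_SLOTS;
    ksm->num_slots = num_slots;
    for (size_t i = 0; i < MODEL_MAX_SLOTS; i++)
        ksm->slot_key[i] = -1;
}

void model_ksm_init_passthrough(struct model_ksm *ksm)
{
    model_ksm_init(ksm, 0);
}

bool model_ksm_is_passthrough(const struct model_ksm *ksm)
{
    return ksm->num_slots == 0;
}

/* Return the slot holding key_id, programming a free slot if needed. */
int model_ksm_get_slot(struct model_ksm *ksm, int key_id)
{
    /* Passthrough: nothing to program, the key travels with the bio. */
    if (model_ksm_is_passthrough(ksm))
        return MODEL_SLOT_NONE;

    for (unsigned int i = 0; i < ksm->num_slots; i++)
        if (ksm->slot_key[i] == key_id)
            return (int)i;          /* key already programmed */

    for (unsigned int i = 0; i < ksm->num_slots; i++)
        if (ksm->slot_key[i] == -1) {
            ksm->slot_key[i] = key_id;
            return (int)i;          /* programmed into a free slot */
        }

    return MODEL_SLOT_BUSY;         /* real code would evict or wait */
}
```

In this model a DM device would use the passthrough variant and spend no memory on slot bookkeeping, while the underlying device's manager is the one that actually programs the key.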
    • dm era: only resize metadata in preresume · cca2c6ae
      Nikos Tsironis authored
      Metadata resize shouldn't happen in the ctr. The ctr loads a temporary
      (inactive) table that will only become active upon resume. That is why
      resize should always be done as part of resume. Otherwise a load (ctr)
      whose inactive table never becomes active will incorrectly resize the
      metadata.
      
      Also, perform the resize directly in preresume, instead of using the
      worker to do it.
      
      The worker might run other metadata operations, e.g., it could start
      digestion, before resizing the metadata. These operations will end up
      using the old size.
      
      This could lead to errors, like:
      
        device-mapper: era: metadata_digest_transcribe_writeset: dm_array_set_value failed
        device-mapper: era: process_old_eras: digest step failed, stopping digestion
      
      The reason for the above error is that the worker started the digestion
      of the archived writeset using the old, larger size.
      
      As a result, metadata_digest_transcribe_writeset tried to write beyond
      the end of the era array.
      
      Fixes: eec40579 ("dm: add era target")
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
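The split between table load and resume described in this commit can be sketched as a toy model. The structures and functions below (`struct model_table`, `struct model_metadata`, `model_ctr()`, `model_preresume_resize()`) are hypothetical stand-ins, not the dm-era code; the sketch only shows the ordering: the constructor records the requested size, and the resize happens synchronously when the table is about to become active.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are not the dm-era structures. */
struct model_table {
    unsigned long nr_blocks;   /* size requested by this table load */
};

struct model_metadata {
    unsigned long nr_blocks;   /* size the metadata currently describes */
    unsigned int resize_count; /* how many resizes were performed */
};

/* Constructor: only records the request and never touches the metadata. */
void model_ctr(struct model_table *t, unsigned long nr_blocks)
{
    t->nr_blocks = nr_blocks;
}

/*
 * Preresume: the table is about to become active, so resize here,
 * directly, before any other metadata operation can run.
 */
bool model_preresume_resize(const struct model_table *t,
                            struct model_metadata *md)
{
    if (md->nr_blocks == t->nr_blocks)
        return false;          /* nothing to do */
    md->nr_blocks = t->nr_blocks;
    md->resize_count++;
    return true;
}
```

A table that is loaded but never resumed leaves the metadata untouched in this model, which is the property the commit message asks for.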
  2. 10 Feb, 2021 6 commits
    • dm era: Use correct value size in equality function of writeset tree · 64f2d15a
      Nikos Tsironis authored
      Fix the writeset tree equality test function to use the right value size
      when comparing two btree values.
      
      Fixes: eec40579 ("dm: add era target")
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Reviewed-by: Ming-Hung Tsai <mtsai@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm era: Fix bitset memory leaks · 904e6b26
      Nikos Tsironis authored
      Deallocate the memory allocated for the in-core bitsets when destroying
      the target and in error paths.
      
      Fixes: eec40579 ("dm: add era target")
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Reviewed-by: Ming-Hung Tsai <mtsai@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm era: Verify the data block size hasn't changed · c8e846ff
      Nikos Tsironis authored
      dm-era doesn't support changing the data block size of existing devices,
      so check explicitly that the requested block size for a new target
      matches the one stored in the metadata.
      
      Fixes: eec40579 ("dm: add era target")
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Reviewed-by: Ming-Hung Tsai <mtsai@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm era: Reinitialize bitset cache before digesting a new writeset · 25249333
      Nikos Tsironis authored
      In case of devices with at most 64 blocks, the digestion of consecutive
      eras uses the writeset of the first era as the writeset of all eras to
      digest, leading to lost writes. That is, we lose the information about
      what blocks were written during the affected eras.
      
      The digestion code uses a dm_disk_bitset object to access the archived
      writesets. This structure includes a one word (64-bit) cache to reduce
      the number of array lookups.
      
      This structure is initialized only once, in metadata_digest_start(),
      when we kick off digestion.
      
      But, when we insert a new writeset into the writeset tree, before the
      digestion of the previous writeset is done, or equivalently when there
      are multiple writesets in the writeset tree to digest, then all these
      writesets are digested using the same cache and the cache is not
      re-initialized when moving from one writeset to the next.
      
      For devices with more than 64 blocks, i.e., the size of the cache, the
      cache is indirectly invalidated when we move to a next set of blocks, so
      we avoid the bug.
      
      But for devices with at most 64 blocks we end up using the same cached
      data for digesting all archived writesets, i.e., the cache is loaded
      when digesting the first writeset and it never gets reloaded, until the
      digestion is done.
      
      As a result, the writeset of the first era to digest is used as the
      writeset of all the following archived eras, leading to lost writes.
      
      Fix this by reinitializing the dm_disk_bitset structure, and thus
      invalidating the cache, every time the digestion code starts digesting a
      new writeset.
      
      Fixes: eec40579 ("dm: add era target")
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
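The stale-cache effect described in this commit is easy to reproduce in a userspace model. The sketch below is not the dm_disk_bitset code: `struct cached_bitset`, `cached_bitset_init()` and `cached_bitset_test()` are hypothetical names, and each writeset is reduced to a plain array of 64-bit words. It assumes the cache is keyed only by word index, so switching to another writeset with the same index does not trigger a reload.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitset reader with a one-word cache, keyed only by word index. */
struct cached_bitset {
    bool     cache_valid;
    size_t   cached_index;  /* index of the cached 64-bit word */
    uint64_t cached_word;   /* contents of that word when it was loaded */
};

/* Invalidate the cache; must be called before reading a new writeset. */
void cached_bitset_init(struct cached_bitset *cb)
{
    cb->cache_valid = false;
    cb->cached_index = 0;
    cb->cached_word = 0;
}

/* Test one bit, reloading the cache only when a different word is needed. */
bool cached_bitset_test(struct cached_bitset *cb, const uint64_t *words,
                        size_t bit)
{
    size_t index = bit / 64;

    if (!cb->cache_valid || cb->cached_index != index) {
        cb->cached_word = words[index];
        cb->cached_index = index;
        cb->cache_valid = true;
    }
    return (cb->cached_word >> (bit % 64)) & 1;
}
```

With two single-word writesets, reading the second one through the same reader without calling `cached_bitset_init()` returns the bits of the first; calling it between writesets forces a reload. With more than one word per writeset, moving to the next word index reloads the cache anyway, which matches the explanation of why larger devices are not affected.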
    • dm era: Update in-core bitset after committing the metadata · 2099b145
      Nikos Tsironis authored
      In case of a system crash, dm-era might fail to mark blocks as written
      in its metadata, although the corresponding writes to these blocks were
      passed down to the origin device and completed successfully.
      
      Consider the following sequence of events:
      
      1. We write to a block that has not yet been written in the current era
      2. era_map() checks the in-core bitmap for the current era and sees
         that the block is not marked as written.
      3. The write is deferred for submission after the metadata have been
         updated and committed.
      4. The worker thread processes the deferred write
         (process_deferred_bios()) and marks the block as written in the
         in-core bitmap, **before** committing the metadata.
      5. The worker thread starts committing the metadata.
      6. We do more writes that map to the same block as the write of step (1)
      7. era_map() checks the in-core bitmap and sees that the block is marked
         as written, **although the metadata have not been committed yet**.
      8. These writes are passed down to the origin device immediately and the
         device reports them as completed.
      9. The system crashes, e.g., power failure, before the commit from step
         (5) finishes.
      
      When the system recovers and we query the dm-era target for the list of
      written blocks it doesn't report the aforementioned block as written,
      although the writes of step (6) completed successfully.
      
      The issue is that era_map() decides whether or not to defer a write
      based on uncommitted information. The root cause of the bug is that we
      update the in-core bitmap **before** committing the metadata.
      
      Fix this by updating the in-core bitmap **after** successfully
      committing the metadata.
      
      Fixes: eec40579 ("dm: add era target")
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
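The ordering fix can be stated as an invariant: the in-core bit may only be set once the on-disk state already records the block. The following is a minimal sketch of that invariant for a single block, using hypothetical names (`struct era_model`, `model_write_must_defer()`, `model_process_deferred()`); the `commit_ok` flag stands in for the result of the metadata commit and is an assumption of this model.

```c
#include <stdbool.h>

/* State of one block in the current era. */
struct era_model {
    bool in_core_written;  /* what the map path consults */
    bool on_disk_written;  /* what survives a crash */
};

/* Map path: a write skips the worker only if in-core says "written". */
bool model_write_must_defer(const struct era_model *m)
{
    return !m->in_core_written;
}

/*
 * Worker path for a deferred write. The in-core bit is set only after
 * the commit succeeded, so in_core_written implies on_disk_written.
 */
bool model_process_deferred(struct era_model *m, bool commit_ok)
{
    if (!commit_ok)
        return false;           /* in-core stays clear, writes keep deferring */

    m->on_disk_written = true;  /* metadata committed */
    m->in_core_written = true;  /* only now may later writes bypass */
    return true;
}
```

If the commit fails or the system crashes before it finishes, later writes to the same block are still deferred in this model, so no write can complete without the block being recorded.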
    • dm era: Recover committed writeset after crash · de89afc1
      Nikos Tsironis authored
      Following a system crash, dm-era fails to recover the committed writeset
      for the current era, leading to lost writes. That is, we lose the
      information about what blocks were written during the affected era.
      
      dm-era assumes that the writeset of the current era is archived when the
      device is suspended. So, when resuming the device, it just moves on to
      the next era, ignoring the committed writeset.
      
      This assumption holds when the device is properly shut down. But, when
      the system crashes, the code that suspends the target never runs, so the
      writeset for the current era is not archived.
      
      There are three issues that cause the committed writeset to get lost:
      
      1. dm-era doesn't load the committed writeset when opening the metadata
      2. The code that resizes the metadata wipes the information about the
         committed writeset (assuming it was loaded at step 1)
      3. era_preresume() starts a new era, without taking into account that
         the current era might not have been archived, due to a system crash.
      
      To fix this:
      
      1. Load the committed writeset when opening the metadata
      2. Fix the code that resizes the metadata to make sure it doesn't wipe
         the loaded writeset
      3. Fix era_preresume() to check for a loaded writeset and archive it,
         before starting a new era.
      
      Fixes: eec40579 ("dm: add era target")
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
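The three-part fix listed above can be sketched as a small state model: opening the metadata notes whether a committed writeset exists, and preresume archives it before moving to the next era. All names here (`struct era_md_model`, `model_md_open()`, `model_era_preresume()`) are hypothetical, and archiving is reduced to a counter; this is an illustration of the control flow only, not the dm-era implementation.

```c
#include <stdbool.h>

struct era_md_model {
    unsigned int current_era;
    bool writeset_loaded;         /* committed writeset found at open */
    unsigned int archived_count;  /* writesets archived so far */
};

/* Open: remember a committed writeset left behind by an unclean shutdown. */
void model_md_open(struct era_md_model *md, unsigned int era,
                   bool committed_writeset_present)
{
    md->current_era = era;
    md->writeset_loaded = committed_writeset_present;
    md->archived_count = 0;
}

/* Preresume: archive any loaded writeset first, then start a new era. */
void model_era_preresume(struct era_md_model *md)
{
    if (md->writeset_loaded) {
        md->archived_count++;     /* keep the record of written blocks */
        md->writeset_loaded = false;
    }
    md->current_era++;
}
```

After a clean shutdown nothing is loaded and preresume simply starts the next era; after a crash the leftover writeset is archived exactly once before the era advances.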
  3. 09 Feb, 2021 6 commits
  4. 08 Feb, 2021 1 commit
  5. 03 Feb, 2021 10 commits
  6. 02 Feb, 2021 1 commit
  7. 01 Feb, 2021 2 commits
  8. 29 Jan, 2021 2 commits
  9. 28 Jan, 2021 1 commit
  10. 27 Jan, 2021 8 commits