1. 03 Nov, 2019 4 commits
    • Merge branch 'qa-quarantine-cluster-health-spec' into 'master' · f0267e41
      Mark Lapierre authored
      Quarantine cluster health spec
      
      See merge request gitlab-org/gitlab!19505
    • Nailia Iskhakova's avatar
    • Merge branch 'bvl-robust-gpg-homedir-cleanup' into 'master' · 7b27e4cb
      Stan Hu authored
      Improve cleanup of gpg-homedirs
      
      See merge request gitlab-org/gitlab!19311
    • Improve cleanup of gpg-homedirs · 7ac392b4
      Bob Van Landuyt authored
      The `gpg-agent` that could have been spawned here dies when it sees
      its socket disappear.
      
      However, we sometimes seem to fail to delete the homedir, which
      lets the `gpg-agent` live on forever. We noticed the failed
      deletion in
      http://gitlab.com/gitlab-org/gitlab-foss/issues/36998: there was a
      race condition during the deletion where `gpg-agent` would still be
      modifying files after we had already called `FileUtils.remove_entry`.
      
      This will attempt to delete the directory multiple times, at least 0.1
      seconds apart. This is a naive way of trying to make sure
      we clean up the homedir and count on `gpg-agent` to see that and make
      itself go away.
      
      On a web node we'll attempt for at most 0.5 seconds to clean up the
      directory before failing. In a sidekiq process we'll attempt the
      deletion for up to 2 seconds.
      
      When the cleanup fails, we will now track that exception in
      Sentry to gain some visibility.
      
      This also adds counters for the creation and deletion of tmp
      keychains, which we should be able to correlate to the number of
      zombie `gpg-agent` processes.
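The retry behaviour described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the merge request: the method name `remove_dir_with_retries`, the constants, and the injectable `remover:` argument are all hypothetical. Only the 0.1 second interval and the 0.5 second and 2 second limits come from the message itself.

```ruby
require 'fileutils'
require 'tmpdir'

# Values taken from the commit message; the constant names are hypothetical.
RETRY_INTERVAL     = 0.1 # minimum seconds between deletion attempts
WEB_TIME_LIMIT     = 0.5 # seconds to keep trying on a web node
SIDEKIQ_TIME_LIMIT = 2   # seconds to keep trying in a sidekiq process

def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Hypothetical helper: tries to remove `dir` until it succeeds or another
# attempt would exceed `time_limit`. Returns the number of attempts made and
# re-raises the last error once the limit is reached, so the caller can
# report it (for example to an error tracker).
def remove_dir_with_retries(dir, time_limit:, interval: RETRY_INTERVAL,
                            remover: FileUtils.method(:remove_entry))
  deadline = monotonic_now + time_limit
  attempts = 0

  begin
    attempts += 1
    remover.call(dir) if File.exist?(dir)
  rescue SystemCallError
    # Give up if waiting one more interval would pass the deadline.
    raise if monotonic_now + interval > deadline

    sleep(interval)
    retry
  end

  attempts
end

# Usage: clean up a temporary homedir with the web-node limit.
homedir = Dir.mktmpdir('gpg-homedir-sketch')
remove_dir_with_retries(homedir, time_limit: WEB_TIME_LIMIT)
```

Using a monotonic clock keeps the deadline unaffected by wall-clock adjustments, and the `remover:` argument exists only so the retry path can be exercised without a real `gpg-agent` racing the deletion.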
  2. 02 Nov, 2019 8 commits
  3. 01 Nov, 2019 28 commits