1. 07 Jul, 2021 1 commit
    • Fix module-based build · c9db60e8
      Kirill Smelkov authored
      @lpgeneau reports that module-based build fails with
      
          2021-07-07 16:37:34 slapos[2415] INFO GOMODSRC /srv/slapgrid/slappart11/srv/runner/software/a12cdf3481c4202ae72cb255d8a6c183/go.work/src/lab.nexedi.com/kirr/git-backup:./...
          2021-07-07 16:37:34 slapos[2415] INFO ../../../../pkg/mod/lab.nexedi.com/kirr/go123@v0.0.0-20210302025843-863c4602a230/xerr/xerr.go:77:2: missing go.sum entry for module providing package github.com/pkg/errors (imported by lab.nexedi.com/kirr/go123/xerr); to add:
          2021-07-07 16:37:34 slapos[2415] INFO   go get lab.nexedi.com/kirr/go123/xerr@v0.0.0-20210302025843-863c4602a230
          2021-07-07 16:37:34 slapos[2415] INFO gowork.goinstall: Non zero exit code (1) while running command.
      
      -> Fix it by updating go123 dependency and running recommended command to update go.sum.
      
      Amends 3c804105 (*: Add Go modules support; Upgrade to latest git2go release)
      c9db60e8
  2. 02 Mar, 2021 1 commit
  3. 02 Jul, 2020 1 commit
  4. 20 May, 2020 3 commits
  5. 25 Feb, 2020 1 commit
  6. 10 Feb, 2020 3 commits
  7. 13 Jan, 2020 1 commit
    • restore: Rework extraction pipeline to use xsync.WorkGroup · 6af054b0
      Kirill Smelkov authored
      The pattern where multiple workers are spawned to work on a common task
      and where whole work needs to be canceled on first error is now well
      understood, with the functionality to broadcast cancel and propagate
      errors being wrapped into libraries such as
      
      	https://godoc.org/golang.org/x/sync/errgroup			and
      	https://godoc.org/lab.nexedi.com/kirr/go123/xsync#WorkGroup
      	(go123@515a6d14)
      
      Let's streamline the code by using xsync.WorkGroup (it is in our hands,
      a bit more well designed (imho), has analog in Pygolang, and can be
      changed/enhanced as needed).
      
      The other reason to rework the code is that the workgroup is created
      under context (currently always background) and can be canceled by that
      context cancel. In the next patch we'll teach all git-backup
      subcommands, including restore, to work under context, and by using
      xsync.WorkGroup we will automatically handle cancellation from outside,
      while without reworking extraction pipeline we would need to
      additionally glue ctx cancel to signal to workers to stop.
      
      Compared to the previous state, both xsync.WorkGroup and errgroup return
      only the first error; however, this is unlikely to cause problems in
      practice as the first error is usually the most informative one.
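
      The "cancel everything on first error" pattern described above can be
      sketched with the standard library only. This is a minimal illustration,
      not the xsync.WorkGroup or errgroup API: the workGroup type, newWorkGroup
      and runDemo are hypothetical names introduced just for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// workGroup is a hypothetical, stdlib-only stand-in for the pattern.
// It is NOT the xsync.WorkGroup or errgroup API.
type workGroup struct {
	ctx    context.Context
	cancel context.CancelFunc
	wg     sync.WaitGroup
	once   sync.Once
	err    error
}

// newWorkGroup creates a group whose workers run under a child of parent,
// so canceling parent from outside also cancels the workers.
func newWorkGroup(parent context.Context) *workGroup {
	ctx, cancel := context.WithCancel(parent)
	return &workGroup{ctx: ctx, cancel: cancel}
}

// Go spawns f as a worker. The first non-nil error is remembered and
// cancels the context shared by all workers.
func (g *workGroup) Go(f func(ctx context.Context) error) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		if err := f(g.ctx); err != nil {
			g.once.Do(func() {
				g.err = err
				g.cancel()
			})
		}
	}()
}

// Wait blocks until all workers have returned and reports the first error.
func (g *workGroup) Wait() error {
	g.wg.Wait()
	g.cancel() // release context resources
	return g.err
}

// runDemo starts one failing worker and n workers that stop only on cancel.
func runDemo(n int) error {
	g := newWorkGroup(context.Background())
	g.Go(func(ctx context.Context) error {
		return errors.New("extract failed")
	})
	for i := 0; i < n; i++ {
		g.Go(func(ctx context.Context) error {
			<-ctx.Done() // stop as soon as the group is canceled
			return ctx.Err()
		})
	}
	return g.Wait()
}

func main() {
	fmt.Println(runDemo(3))
}
```

      Because the group is created under a context, glueing an outside cancel
      to the workers needs no extra code: canceling the parent context is
      enough for every worker that watches ctx.Done().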
      6af054b0
  8. 05 Jan, 2020 2 commits
    • gitlab-backup: pull|restore: Cleanup $tmpd in defer-style · 00f58d0b
      Kirill Smelkov authored
      Similarly to the previous patch, let's always clean up the gitlab-backup
      temporary folder, even in the presence of errors. Keeping $tmpd
      on error did not prevent further gitlab-backup runs from proceeding, but it
      can quickly eat up disk space if there are many such runs. If debugging
      is needed one can comment out the cleanup, but by default let's be
      production friendly out of the box.
      
      Based on patch by @alain.takoudjou:
      !4
      
      Original description from Alain:
      
      ---- 8< ----
      When the script exits, remove the tmp backup folders which are no longer needed.
      Keeping this folder when a backup fails contributes to filling the disk
      of the server. backup.locked is also removed, because we want to
      automatically retry gitlab-backup if the previous backup failed, without
      human action. If the file is not removed automatically, backup is
      blocked until someone removes it.
      00f58d0b
    • pull: Don't leave backup repository locked on error · 2cc61da3
      Kirill Smelkov authored
      On pull git-backup locks the backup repository to make sure another
      concurrent `git-backup pull` process is not running. However until now,
      if a pull was failing, the lock was left unreleased, which made follow-up
      pull attempts fail while acquiring the lock until the lock was
      manually removed with `git update-ref -d ...`. Probably originally I
      made it like this in 6f237f22 (git-backup: Initial draft) to make sure
      that if there is a problem it does not go unnoticed and forces me to
      investigate. But in general we do _not_ need to keep the lock on error
      return after `git-backup pull` completes even abnormally.
      
      This "lock left unreleased" is causing operational issues on
      lab.nexedi.com from time to time: if a pull attempt fails for some, even
      temporary, reason, all subsequent pull attempts will fail until a human
      intervenes and removes the lock ref.
      
      Fix it.
      
      See also: !4
      2cc61da3
  9. 29 Aug, 2018 1 commit
    • Fix build with Go1.11 · 9791c04e
      Kirill Smelkov authored
      	# lab.nexedi.com/kirr/git-backup
      	./git.go:177: Raisef call needs 1 arg but has 2 args
      
      The bug was there from day 1 after rewrite in Go in 28986e0e.
      9791c04e
  10. 20 Jun, 2018 1 commit
  11. 13 Jun, 2018 1 commit
  12. 12 Jun, 2018 4 commits
    • pull: Speedup fetching by prebuilding index of objects we already have at start · 3efed898
      Kirill Smelkov authored
      As already noted in 899103bf (pull: Switch from porcelain `git
      fetch` to plumbing `git fetch-pack` + friends), `git-backup pull` on
      lab.nexedi.com became slow, and most of the slowness
      was tracked down to the fact that `git fetch` for every pulled repository does
      a linear scan of the whole backup repository history just to find out there is
      usually nothing to fetch. Quoting 899103bf:
      
      """
          `git fetch`, before fetching data from remote repository, first checks
          whether it already locally has all the objects remote advertises. This
          boils down to running
      
      	echo $remote_tips | git rev-list --quiet --objects --stdin --not --all
      
          and checking whether it succeeds or not:
      
      	https://git.kernel.org/pub/scm/git/git.git/commit/?h=4191c35671
      	https://git.kernel.org/pub/scm/git/git.git/tree/builtin/fetch.c?h=v2.18.0-rc1-1-g6f333ff2fb#n925
      	https://git.kernel.org/pub/scm/git/git.git/tree/connected.c?h=v2.18.0-rc1-1-g6f333ff2fb#n8
      
          The "--not --all" in the query means that objects should be not
          reachable from all locally existing refs and is implemented by linearly
          scanning from tip of those existing refs and marking objects reachable
          from there as "do not print".
      
          In case of git-backup, where we have mostly master which is super commit
          merging from whole histories of all projects and from backup history,
          linearly scanning from such a tip goes through lots of commits. Up to
          the point where fetching a small, outdated repository, which was already
          pulled into backup and did not change for a long time, takes more than 30
          seconds with almost 100% of that time being spent in quickfetch() only.
      """
      
      The solution is that we can build an index of the objects we already have
      only once at startup, and then in fetch, after checking lsremote output, consult
      that index, and if we see we already have everything for an advertised
      reference - just avoid giving it to fetch-pack to process. It turns out that for
      many pulled repositories no references have changed at all, and this way
      fetch-pack can be skipped completely. This leads to a dramatic speedup: before,
      `gitlab-backup pull` was taking ~ 2 hours, and now it takes something under ~ 5 minutes.
      
      The index building itself takes ~ 30 seconds - the time which we were
      previously spending to fetch just from 1 unchanged repository. The index size
      is small and so it all can be kept in RAM - please see details in the code
      comments on this.
      
      I initially wanted to speedup fetching by teaching `git fetch-objects` to
      consult backup repo bitmap reachability index (if, for a commit, we can see
      that there is an entry in this index -> we know we already have all reachable
      objects for this commit and can skip fetching). This won't however work
      fully for all our refs - 40% of them are mostly tags, and since in the backup
      repository we don't keep tag objects - we keep tags/tree/blobs encoded as
      commits - sha1 of those 40% references to tags won't be in bitmap index.
      
      So just do the indexing ourselves.
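
      The idea of consulting a prebuilt index before fetching can be sketched
      as follows. This is a simplified illustration, not the actual git-backup
      code: advertisedRef, buildIndex and refsToFetch are hypothetical names,
      the sha1 values are shortened placeholders, and the index is assumed to
      be a plain in-memory set.

```go
package main

import "fmt"

// advertisedRef is one entry of what the remote advertises.
// The name goes without the "refs/" prefix.
type advertisedRef struct {
	name string
	sha1 string
}

// buildIndex builds, once at startup, the set of object ids that are
// already present in the backup repository.
func buildIndex(have []string) map[string]struct{} {
	index := make(map[string]struct{}, len(have))
	for _, sha1 := range have {
		index[sha1] = struct{}{}
	}
	return index
}

// refsToFetch returns only those advertised refs whose tip is not in the
// index. An empty result means fetch-pack can be skipped completely.
func refsToFetch(adv []advertisedRef, index map[string]struct{}) []advertisedRef {
	var need []advertisedRef
	for _, r := range adv {
		if _, ok := index[r.sha1]; !ok {
			need = append(need, r)
		}
	}
	return need
}

func main() {
	index := buildIndex([]string{"aaa1", "bbb2"}) // placeholder object ids
	adv := []advertisedRef{
		{"heads/master", "aaa1"},
		{"tags/v1", "bbb2"},
		{"heads/topic", "ccc3"},
	}
	need := refsToFetch(adv, index)
	fmt.Println(len(need), need[0].name)
}
```

      A set lookup per advertised reference replaces the linear history scan
      that `git fetch` would otherwise repeat for every pulled repository.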
      3efed898
    • Factor out backup.refs loading code from restore · 1be6aaaa
      Kirill Smelkov authored
      In the next patch we will need to load backup.refs in the beginning of
      pull too. The factored-out function is changed to return a regular error
      instead of raising an exception (which will be the general plan from now on).
      1be6aaaa
    • pull: Switch from porcelain `git fetch` to plumbing `git fetch-pack` + friends · 899103bf
      Kirill Smelkov authored
      On lab.nexedi.com `git-backup pull` became slow, and most of the slowness
      was tracked down to the following:
      
      `git fetch`, before fetching data from remote repository, first checks
      whether it already locally has all the objects remote advertises. This
      boils down to running
      
      	echo $remote_tips | git rev-list --quiet --objects --stdin --not --all
      
      and checking whether it succeeds or not:
      
      	https://git.kernel.org/pub/scm/git/git.git/commit/?h=4191c35671
      	https://git.kernel.org/pub/scm/git/git.git/tree/builtin/fetch.c?h=v2.18.0-rc1-1-g6f333ff2fb#n925
      	https://git.kernel.org/pub/scm/git/git.git/tree/connected.c?h=v2.18.0-rc1-1-g6f333ff2fb#n8
      
      The "--not --all" in the query means that objects should be not
      reachable from all locally existing refs and is implemented by linearly
      scanning from tip of those existing refs and marking objects reachable
      from there as "do not print".
      
      In case of git-backup, where we have mostly master which is super commit
      merging from whole histories of all projects and from backup history,
      linearly scanning from such a tip goes through lots of commits. Up to
      the point where fetching a small, outdated repository, which was already
      pulled into backup and did not change for a long time, takes more than 30
      seconds with almost 100% of that time being spent in quickfetch() only.
      
      The solution will be to optimize checking whether we already have all the
      remote objects and to not repeat whole backup-repo scanning for every
      pulled repository. This will be done via first querying through `git
      ls-remote` what tips remote repository has, then checking on
      git-backup specific index which tips we already have and then fetching
      only the rest. This way we are essentially moving most of quickfetch
      phase of git into git-backup.
      
      Since we'll be telling git to fetch only some of the remote refs, we
      will either have to amend ourselves the refs `git fetch` creates after
      fetching, or to not rely on `git fetch` creating any refs at all. Since
      we already have a long-standing issue that the many, many refs that
      appear after `git fetch` slow down further git fetches
      
      https://lab.nexedi.com/kirr/git-backup/blob/0ab7bbb6/git-backup.go#L551
      
      the longer term plan will be not to create unneeded references.
      Since 2 forks could have references covering the same commits, we would
      either have to compare references created after git-fetch and deduplicate
      them or manage references creation ourselves.
      
      It is also generally better to split `git fetch` into steps at plumbing
      layer, because after doing so, we can have the chance to optimize or
      tweak any of the steps at our side with knowing full git-backup context
      and indices.
      
      This commit only switches from using `git fetch` to its plumbing
      counterpart `git fetch-pack` + friends + manually creating fetched refs
      the way `git fetch` used to do exactly. There should be neither
      functionality changed nor any speedup.
      
      Further commits will start to take advantage of the switch and optimize
      `git-backup pull`.
      899103bf
  13. 11 Jun, 2018 2 commits
    • Clarify git Ref* types a bit · 350a01f9
      Kirill Smelkov authored
      - tell that reference name always goes without "refs/" prefix
      - use .name for reference name, not .ref: this way
      
      	ref.name
      
        is more readable than
      
      	ref.ref
      
        and so there is less need to use for __ in range loops.
      350a01f9
    • restore: Show details when extracted repo refs were found corrupt · 23e07d70
      Kirill Smelkov authored
      Noticed this while changing how pull works and accidentally introducing
      an error there by leaving an extra "refs/" prefix. With that error,
      before this patch the tests show:
      
              git-backup_test.go:91: git-backup_test.go:204: lab.nexedi.com/kirr/git-backup.cmd_restore: 2 errors:
      			- E: extracted /tmp/t-git-backup981909377/1/dir 2 + β/repo with+fragile name %αβγ.git refs corrupt:
      			- E: extracted /tmp/t-git-backup981909377/1/dir/hello.git refs corrupt:
      
      with the patch tests report:
      
              git-backup_test.go:91: git-backup_test.go:204: lab.nexedi.com/kirr/git-backup.cmd_restore: 2 errors:
                              - E: extracted /tmp/t-git-backup981909377/1/dir 2 + β/repo with+fragile name %αβγ.git refs corrupt:
      
                      want:
                      cbb6d3f205749888f77fb1a88fbac3b8a0b8000f refs/refs/heads/master
      
                      have:
                      cbb6d3f205749888f77fb1a88fbac3b8a0b8000f refs/heads/master
                              - E: extracted /tmp/t-git-backup981909377/1/dir/hello.git refs corrupt:
      
                      want:
                      647e137fd3b31939b36889eba854a298ef97b6ff refs/refs/heads/branch2
                      feeed96ca75fcf8dcf183008f61dbf72e91ab4de refs/refs/heads/master
                      11e67095628aa17b03436850e690faea3006c25d refs/refs/tags/tag-to-blob
                      f735011c9fcece41219729a33f7876cd8791f659 refs/refs/tags/tag-to-commit
                      7124713e403925bc772cd252b0dec099f3ced9c5 refs/refs/tags/tag-to-tag
                      ba899e5639273a6fa4d50d684af8db1ae070351e refs/refs/tags/tag-to-tree
                      7a3343f584218e973165d943d7c0af47a52ca477 refs/refs/test/ref-to-blob
                      61882eb85774ed4401681d800bb9c638031375e2 refs/refs/test/ref-to-tree
      
                      have:
                      647e137fd3b31939b36889eba854a298ef97b6ff refs/heads/branch2
                      feeed96ca75fcf8dcf183008f61dbf72e91ab4de refs/heads/master
                      11e67095628aa17b03436850e690faea3006c25d refs/tags/tag-to-blob
                      f735011c9fcece41219729a33f7876cd8791f659 refs/tags/tag-to-commit
                      7124713e403925bc772cd252b0dec099f3ced9c5 refs/tags/tag-to-tag
                      ba899e5639273a6fa4d50d684af8db1ae070351e refs/tags/tag-to-tree
                      7a3343f584218e973165d943d7c0af47a52ca477 refs/test/ref-to-blob
                      61882eb85774ed4401681d800bb9c638031375e2 refs/test/ref-to-tree
      
      It should be good to have these details if something really breaks after restore.
      23e07d70
  14. 08 Jun, 2018 2 commits
  15. 05 Jun, 2018 1 commit
  16. 25 Apr, 2018 1 commit
    • gitlab-backup: don't keep backup_gitlab.pulled files · 0b8d834b
      Alain Takoudjou authored
      add option to remove or keep pulled backup data
      
      [ kirr: The .pulled files with gitlab backup data (SQL and the like)
        were originally not removed "just in case" in the early days of
        git/gitlab-backup. They are clearly not needed to be kept since their
        content is entered into git backup database by gitlab-backup, and
        leaving those .pulled files just wastes disk space.
      
        So default to not keep them around and for now add an option to
        forcibly preserve the raw gitlab backup if we'll need it just in case or
        for the debugging.
      
        However if it turns out we won't really need -keep in practice, it
        might go away in some time. ]
      
      /reviewed-on !3
      0b8d834b
  17. 07 Mar, 2018 1 commit
  18. 24 Oct, 2017 1 commit
    • Relicense to GPLv3+ with wide exception for all Free Software / Open Source... · e37d99b4
      Kirill Smelkov authored
      Relicense to GPLv3+ with wide exception for all Free Software / Open Source projects + Business options.
      
      Nexedi stack is licensed under Free Software licenses with various exceptions
      that cover three business cases:
      
      - Free Software
      - Proprietary Software
      - Rebranding
      
      As long as one intends to develop Free Software based on Nexedi stack, no
      license cost is involved. Developing proprietary software based on Nexedi stack
      may require a proprietary exception license. Rebranding Nexedi stack is
      prohibited unless rebranding license is acquired.
      
      Through this licensing approach, Nexedi expects to encourage Free Software
      development without restrictions and at the same time create a framework for
      proprietary software to contribute to the long term sustainability of the
      Nexedi stack.
      
      Please see https://www.nexedi.com/licensing for details, rationale and options.
      e37d99b4
  19. 19 Apr, 2017 1 commit
  20. 13 Dec, 2016 4 commits
  21. 03 Nov, 2016 1 commit
    • Don't be fooled by strings.Split(..., "\n") result always having empty "" last element · 3ba6cf73
      Kirill Smelkov authored
      By definition of strings.Split(..., sep) it "slices s into all substrings
      separated by sep and returns a slice of the substrings between those
      separators". That means that
      
          strings.Split("hello\nworld\n", "\n") -> ["hello", "world", ""]     # NOTE the last ""
      
      When parsing a file by lines, though, it is handy not to get the last
      empty "" after the last "\n". #6 shows how we missed that filtering-out
      for the case of an empty backup.refs file and errored out because of that.
      
      To fix let's introduce a helper - splitlines(), which does the job of
      filtering-out last empty entry after last separator. By using this
      helper everywhere we can hopefully avoid problems while pulling only
      empty repositories (#6 case), and also similar ones.
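
      A minimal sketch of such a helper is shown below. The name splitlines
      comes from the text above, but the exact signature and body are an
      assumption and may differ from what the commit adds.

```go
package main

import (
	"fmt"
	"strings"
)

// splitlines splits s by sep and drops the empty element that
// strings.Split leaves after a trailing separator (or for empty input).
// sep is assumed to be non-empty.
func splitlines(s, sep string) []string {
	sv := strings.Split(s, sep)
	if n := len(sv); n > 0 && sv[n-1] == "" {
		sv = sv[:n-1]
	}
	return sv
}

func main() {
	fmt.Printf("%q\n", strings.Split("hello\nworld\n", "\n")) // ["hello" "world" ""]
	fmt.Printf("%q\n", splitlines("hello\nworld\n", "\n"))    // ["hello" "world"]
	fmt.Printf("%q\n", splitlines("", "\n"))                  // []
}
```

      The empty-input case matters here: strings.Split("", "\n") returns one
      empty element, so an empty file would otherwise look like one empty line.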
      
      Fixes #6
      /reported-by @iv
      3ba6cf73
  22. 01 Aug, 2016 3 commits
    • pull: Don't let a lot of empty directories stay under refs/backup/... work prefix after end of pull · 7535343c
      Kirill Smelkov authored
      Continuing 62374038 (pull: Turns unused refs are removed not 100% and a
      lot of empty directories are accumulated) we just make sure to remove
      them in the end of pull.
      
      But NOTE: there could be O(n^2) behaviour still hidden, so it makes
      sense to eventually revisit it and cleanup empty dirs earlier.
      
      For now we just care not to degrade future pull performance. The
      appropriate time for revisiting could be when reworking pull to do
      fetches in parallel.
      
      Updates: https://lab.nexedi.com/lab.nexedi.com/lab.nexedi.com/issues/4
      7535343c
    • restore: Extract packs in multiple workers · ff2f0b67
      Kirill Smelkov authored
      This way it allows us to leverage multiple CPUs on a system for pack
      extractions, which are computation-heavy operations.
      
      The way to do is more-or-less classical:
      
          - main worker prepares requests for pack extraction jobs
      
          - there are multiple pack-extraction workers, which read requests
            from jobs queue and perform them
      
          - at the end we wait for everything to stop and collect errors,
            optionally signalling the whole thing to cancel if we see an error
            coming. (it is only a signal and we still have to wait for
            everything to stop)
      
      The default number of workers is N(CPU) on the system - because we spawn
      separate `git pack-objects ...` for every request.
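
      The scheme above can be sketched as a plain jobs queue. This is a
      simplified illustration under stated assumptions: extractAll and
      demoExtract are hypothetical names, the real work (spawning
      `git pack-objects ...`) is replaced by a callback, and the optional
      cancel signal is left out to keep the example short.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
	"sync"
)

// extractAll feeds pack names into a jobs queue that is served by njobs
// workers, waits for all of them to stop and returns the collected errors.
func extractAll(packs []string, njobs int, extract func(pack string) error) []error {
	if njobs < 1 {
		njobs = 1
	}
	jobs := make(chan string)
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex // protects errv
		errv []error
	)
	for i := 0; i < njobs; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for pack := range jobs {
				if err := extract(pack); err != nil {
					mu.Lock()
					errv = append(errv, err)
					mu.Unlock()
				}
			}
		}()
	}
	// The main worker prepares the requests.
	for _, p := range packs {
		jobs <- p
	}
	close(jobs)
	wg.Wait()
	return errv
}

// demoExtract is a placeholder for real pack extraction.
func demoExtract(pack string) error {
	if pack == "bad.pack" {
		return errors.New("extract " + pack + ": failed")
	}
	return nil
}

func main() {
	packs := []string{"a.pack", "bad.pack", "c.pack"}
	errv := extractAll(packs, runtime.NumCPU(), demoExtract)
	fmt.Println(len(errv))
}
```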
      
      We also now explicitly limit N(CPU) each `git pack-objects ...` can use
      to 1. This way control over how many resources to use is in git-backup's
      hands, and git also packs better this way (when using only 1 thread) because
      when deltifying all objects are considered against each other, not only all
      objects inside 1 thread's object pool. Moreover, even when pack.threads is
      not 1, the first "objects counting" phase of pack is serial, wasting all
      but 1 core.
      
      On lab.nexedi.com we already use pack.threads=1 by default in global
      gitconfig, but the above change is for code to be universal.
      
      Time to restore nexedi/ from lab.nexedi.com backup:
      
      2CPU laptop:
      
          before (pack.threads=1)     10m11s
          before (pack.threads=NCPU)   9m13s
          after  -j1                  10m11s
          after                        6m17s
      
      8CPU system (with other load present, noisy) :
      
          before (pack.threads=1)     ~5m
          after                       ~1m30s
      ff2f0b67
    • raisef: Fix it wrt erraddcallingcontext() · 6c2abbbf
      Kirill Smelkov authored
      like in 302aaaea (raiseif: Fix it wrt erraddcallingcontext()) now fix
      raisef, which I originally overlooked.
      6c2abbbf
  23. 31 Jul, 2016 3 commits
    • xcommit_tree: Teach it to create commit without spawning `git commit-tree ...` · 3a7b390c
      Kirill Smelkov authored
      Because spawning separate process per 1 commit is slow.
      
      Libgit2 does not allow creating commits knowing only tree & parentv
      sha1s, but we can create commit objects by hand pretty easily - their format is
      
          tree <sha1>
          parent <parent1-sha1>
          parent <parent2-sha1>
          ...
          author user <email> date +offset
          committer user <email> date +offset
          LF
          message
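
      The format above can be assembled and named by hand roughly as follows.
      This is a sketch, not the xcommit_tree() implementation: commitRaw and
      objectSha1 are hypothetical helpers, the identity and date in main are
      made-up values, and the date is written the way git stores it (seconds
      since epoch followed by the timezone offset). An object id is the sha1
      of "<type> <size>\0" followed by the object content.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"strings"
)

// objectSha1 computes a git object id: sha1("<type> <size>\x00" + data).
func objectSha1(kind string, data []byte) string {
	h := sha1.New()
	fmt.Fprintf(h, "%s %d\x00", kind, len(data))
	h.Write(data)
	return fmt.Sprintf("%x", h.Sum(nil))
}

// commitRaw assembles raw commit content from the tree sha1, parent sha1s,
// ready-made author/committer lines and the message.
func commitRaw(tree string, parents []string, author, committer, msg string) []byte {
	var b strings.Builder
	fmt.Fprintf(&b, "tree %s\n", tree)
	for _, p := range parents {
		fmt.Fprintf(&b, "parent %s\n", p)
	}
	fmt.Fprintf(&b, "author %s\n", author)
	fmt.Fprintf(&b, "committer %s\n", committer)
	b.WriteString("\n") // empty line separates headers from the message
	b.WriteString(msg)
	return []byte(b.String())
}

func main() {
	emptyTree := objectSha1("tree", nil) // id of the empty tree
	who := "user <user@example.com> 1469923200 +0000" // made-up identity and date
	raw := commitRaw(emptyTree, nil, who, who, "hello\n")
	fmt.Printf("%s", raw)
	fmt.Println("commit id:", objectSha1("commit", raw))
}
```

      Only sha1 strings are needed as input, so no tree or parent objects have
      to be loaded and no subprocess has to be spawned.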
      
      Time for pulling-in kirr/slapos.git
      
      before: 2.5s
      after:  0.9s
      
      NOTE AuthorInfo is changed to inherit from git.Signature (same fields
          and semantic)
      
      NOTE Since libgit2 default ident can fail, and does not look beyond
          user.name and user.email we do backup identity detection
          (user/hostname) - in similar way Git does - ourselves.
      3a7b390c
    • Move xcommit_tree() & friends to gitobjects.go · cc450765
      Kirill Smelkov authored
      We are going to rework this function, but before adding changes let's
      move it to more appropriate place. Since xcommit_tree() creates commit
      object from tree and parents and is pretty standard git function - the
      appropriate place is gitobjects.
      
      NOTE we cannot just replace xcommit_tree() with g.CreateCommit() as the
          latter works with already loaded tree and parent objects, but we
          want to be able to make commits only knowing tree and parents sha1.
      cc450765
    • Verify tag/tree/blob encoding is consistent and always the same · 5aac4734
      Kirill Smelkov authored
      In upcoming patch we are going to switch xcommit_tree() to our own
      implementation, and since this can potentially change how commits are
      represented, for backward compatibility reason we need to make sure
      objects encoded as commits stay the same.
      
      So for all kind of objects (they are present in testdata/ repositories)
      add checks that:
      
          - encode/decode is idempotent
          - encoding and decoding produces exactly expected sha1
      
      One nice side effect of this is that we can now remove the runtime
      consistency check from the tail of decoding. That check was there from the
      beginning - from 6f237f22 (git-backup: Initial draft) - mainly
      because there was no testsuite at that time. The place of that check is,
      however, not even completely right - in case we somehow wrongly pulled an
      object, it has to be detected at pull time, not restore time. So that check
      was checking only 1/2 of the implementation - and not the main one - that
      decoding does not mess up.
      
      Since we now have a proper testsuite and add encode/decode tests in this
      patch, we can remove that partial runtime check. And even if decoding
      messes something up, despite being covered by tests, it will be 100%
      caught by the restore process, because for an extracted repository, if
      some object that needs to be present in it is missing, pack generation
      for that repository will fail. So we can be safe with the removal.
      
      Time for restoring kirr/slapos.git from lab.nexedi.com backup
      
      before: 5.5s
      after:  3.5s
      
      ( so much because there are ~ 500 tags in slapos.git and currently tag
        encoding is done with spawning separate subprocess per tag )
      5aac4734