Commit 155f09e6 authored by Craig Norris's avatar Craig Norris

Merge branch 'docs-aqualls-remove-git-annex' into 'master'

Remove pages and replace with redirects

See merge request gitlab-org/gitlab!60034
parents ab0eab66 8299c621
---
stage: Create
group: Source Code
info: "To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments"
type: reference, howto
disqus_identifier: 'https://docs.gitlab.com/ee/workflow/git_annex.html'
redirect_to: 'index.md'
---
# Git annex
This document was moved to [another location](index.md).
WARNING:
[Git Annex support was removed](https://gitlab.com/gitlab-org/gitlab/-/issues/1648)
in GitLab 9.0. Read through the [migration guide from git-annex to Git LFS](../topics/git/lfs/migrate_from_git_annex_to_git_lfs.md).
The biggest limitation of Git, compared to some older centralized version
control systems has been the maximum size of the repositories.
The general recommendation is to not have Git repositories larger than 1GB to
preserve performance. Although GitLab has no limit (some repositories in GitLab
are over 50GB!), we subscribe to the advice to keep repositories as small as
you can.
Not being able to version control large binaries is a big problem for many
larger organizations.
Videos, photos, audio, compiled binaries, and many other types of files are too
large. As a workaround, people keep artwork-in-progress in a Dropbox folder and
only check in the final result. This results in using outdated files, not
having a complete history, and increases the risk of losing work.
This problem is solved in GitLab Enterprise Edition by integrating the
[git-annex](https://git-annex.branchable.com/) application.
`git-annex` allows managing large binaries with Git without checking the
contents into Git.
You check-in only a symlink that contains the SHA-1 of the large binary. If you
need the large binary, you can sync it from the GitLab server over `rsync`, a
very fast file copying tool.
## GitLab git-annex Configuration
`git-annex` is disabled by default in GitLab. Below are the
configuration options required to enable it.
### Requirements
`git-annex` needs to be installed both on the server and the client-side.
For Debian-like systems (for example, Debian and Ubuntu) this can be achieved by running:
```shell
sudo apt-get update && sudo apt-get install git-annex
```
For RedHat-like systems (for example, CentOS and RHEL) this can be achieved by running:
```shell
sudo yum install epel-release && sudo yum install git-annex
```
### Configuration for Omnibus packages
For Omnibus GitLab packages, only one configuration setting is needed.
The Omnibus package internally sets the correct options in all locations.
1. In `/etc/gitlab/gitlab.rb` add the following line:
```ruby
gitlab_shell['git_annex_enabled'] = true
```
1. Save the file and [reconfigure GitLab](restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
### Configuration for installations from source
There are 2 settings to enable git-annex on your GitLab server.
One is located in `config/gitlab.yml` of the GitLab repository and the other
one is located in `config.yml` of GitLab Shell.
1. In `config/gitlab.yml` add or edit the following lines:
```yaml
gitlab_shell:
git_annex_enabled: true
```
1. In `config.yml` of GitLab Shell add or edit the following lines:
```yaml
git_annex_enabled: true
```
1. Save the files and [restart GitLab](restart_gitlab.md#installations-from-source) for the changes to take effect.
## Using GitLab git-annex
NOTE:
Your Git remotes must be using the SSH protocol, not HTTP(S).
Here is an example workflow of uploading a very large file and then checking it
into your Git repository:
```shell
git clone git@example.com:group/project.git
git annex init 'My Laptop' # initialize the annex project and give an optional description
cp ~/tmp/debian.iso ./ # copy a large file into the current directory
git annex add debian.iso # add the large file to git annex
git commit -am "Add Debian iso" # commit the file metadata
git annex sync --content # sync the Git repo and large file to the GitLab server
```
The output should look like this:
```plaintext
commit
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)
nothing to commit, working tree clean
ok
pull origin
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 5 (delta 2), reused 0 (delta 0)
Unpacking objects: 100% (5/5), done.
From example.com:group/project
497842b..5162f80 git-annex -> origin/git-annex
ok
(merging origin/git-annex into git-annex...)
(recording state in git...)
copy debian.iso (checking origin...) (to origin...)
SHA256E-s26214400--8092b3d482fb1b7a5cf28c43bc1425c8f2d380e86869c0686c49aa7b0f086ab2.iso
26,214,400 100% 638.88kB/s 0:00:40 (xfr#1, to-chk=0/1)
ok
pull origin
ok
(recording state in git...)
push origin
Counting objects: 15, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (13/13), done.
Writing objects: 100% (15/15), 1.64 KiB | 0 bytes/s, done.
Total 15 (delta 1), reused 0 (delta 0)
To example.com:group/project.git
* [new branch] git-annex -> synced/git-annex
* [new branch] master -> synced/master
ok
```
Your files can be found in the `master` branch, but more branches are created
by the `annex sync` command.
Git Annex creates a new directory at `.git/annex/` and records the
tracked files in the `.git/config` file. The files you assign to be tracked
with `git-annex` don't affect the existing `.git/config` records. The files
are turned into symbolic links that point to data in `.git/annex/objects/`.
The `debian.iso` file in the example contain the symbolic link:
```plaintext
.git/annex/objects/ZW/1k/SHA256E-s82701--6384039733b5035b559efd5a2e25a493ab6e09aabfd5162cc03f6f0ec238429d.png/SHA256E-s82701--6384039733b5035b559efd5a2e25a493ab6e09aabfd5162cc03f6f0ec238429d.iso
```
Use `git annex info` to retrieve the information about the local copy of your
repository.
---
You can download a single large file with these commands:
```shell
git clone git@gitlab.example.com:group/project.git
git annex sync # sync Git branches but not the large file
git annex get debian.iso # download the large file
```
To download all files:
```shell
git clone git@gitlab.example.com:group/project.git
git annex sync --content # sync Git branches and download all the large files
```
By using `git-annex` without GitLab, anyone that can access the server can also
access the files of all projects. GitLab Annex ensures that you can only
access files of projects you have access to (developer, maintainer, or owner role).
## How it works
Internally GitLab uses [GitLab Shell](https://gitlab.com/gitlab-org/gitlab-shell) to handle SSH access and this was a great
integration point for `git-annex`.
There is a setting in GitLab Shell so you can disable GitLab Annex support
if you want to.
## Troubleshooting tips
Differences in the version of `git-annex` on the GitLab server and on local machines
can cause `git-annex` to raise unpredicted warnings and errors.
Consult the [Annex upgrade page](https://git-annex.branchable.com/upgrades/) for more information about
the differences between versions. You can find out which version is installed
on your server by navigating to `https://pkgs.org/download/git-annex` and
searching for your distribution.
Although there is no general guide for `git-annex` errors, there are a few tips
on how to go around the warnings.
### `git-annex-shell: Not a git-annex or gcrypt repository`
This warning can appear on the initial `git annex sync --content` and is caused
by differences in `git-annex-shell`. You can read more about it
[in this git-annex issue](https://git-annex.branchable.com/forum/Error_from_git-annex-shell_on_creation_of_gcrypt_special_remote/).
One important thing to note is that despite the warning, the `sync` succeeds
and the files are pushed to the GitLab repository.
If you get hit by this, you can run the following command inside the repository
that the warning was raised:
```shell
git config remote.origin.annex-ignore false
```
Consecutive runs of `git annex sync --content` **should not** produce this
warning and the output should look like this:
```plaintext
commit ok
pull origin
ok
pull origin
ok
push origin
```
<!-- This redirect file can be deleted after <2021-07-22>. -->
<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
......@@ -190,7 +190,6 @@ Learn how to install, configure, update, and maintain your GitLab instance.
- [Git LFS configuration](lfs/index.md): Learn how to configure LFS for GitLab.
- [Housekeeping](housekeeping.md): Keep your Git repositories tidy and fast.
- [Configuring Git Protocol v2](git_protocol.md): Git protocol version 2 support.
- [Manage large files with `git-annex` (Deprecated)](git_annex.md)
## Monitoring GitLab
......
......@@ -101,5 +101,4 @@ The following relate to Git Large File Storage:
- [Removing objects from LFS](lfs/index.md#removing-objects-from-lfs)
- [GitLab Git LFS user documentation](lfs/index.md)
- [GitLab Git LFS admin documentation](../../administration/lfs/index.md)
- [Git Annex to Git LFS migration guide](lfs/migrate_from_git_annex_to_git_lfs.md)
- [Towards a production quality open source Git LFS server](https://about.gitlab.com/blog/2015/08/13/towards-a-production-quality-open-source-git-lfs-server/)
---
stage: Create
group: Source Code
info: "To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments"
type: reference, howto
redirect_to: 'index.md'
---
# Migration guide from Git Annex to Git LFS **(FREE)**
This document was moved to [another location](index.md).
WARNING:
Git Annex support [has been removed](https://gitlab.com/gitlab-org/gitlab/-/issues/1648) in GitLab Enterprise
Edition 9.0 (2017/03/22).
Both [Git Annex](http://git-annex.branchable.com/) and [Git LFS](https://git-lfs.github.com/) are tools to manage large files in Git.
## History
Git Annex [was introduced in GitLab Enterprise Edition 7.8](https://about.gitlab.com/blog/2015/02/17/gitlab-annex-solves-the-problem-of-versioning-large-binaries-with-git/), at a time
where Git LFS didn't yet exist. A few months later, GitLab brought support for
Git LFS in [GitLab 8.2](https://about.gitlab.com/blog/2015/11/23/announcing-git-lfs-support-in-gitlab/) and is available for both Community and
Enterprise editions.
## Differences between Git Annex and Git LFS
Some items below are general differences between the two protocols and some are
ones that GitLab developed.
- Git Annex works only through SSH, whereas Git LFS works both with SSH and HTTPS
(SSH support was added in GitLab 8.12).
- Annex files are stored in a sub-directory of the normal repositories, whereas
LFS files are stored outside of the repositories in a place you can define.
- Git Annex requires a more complex setup, but has much more options than Git
LFS. You can compare the commands each one offers by running `man git-annex`
and `man git-lfs`.
- Annex files cannot be browsed directly in the GitLab interface, whereas LFS
files can.
## Migration steps
Git Annex files are stored in a sub-directory of the normal repositories
(`.git/annex/objects`) and LFS files are stored outside of the repositories.
The two aren't compatible as they are using a different scheme. Therefore, the
migration has to be done manually per repository.
There are basically two steps you need to take in order to migrate from Git
Annex to Git LFS.
### TL; DR
If you know what you are doing and want to skip the reading, this is what you
need to do (we assume you have [git-annex enabled](../../../administration/git_annex.md#using-gitlab-git-annex) in your
repository and that you have made backups in case something goes wrong).
Fire up a terminal, navigate to your Git repository and:
1. Disable `git-annex`:
```shell
git annex sync --content
git annex direct
git annex uninit
git annex indirect
```
1. Enable `git-lfs`:
```shell
git lfs install
git lfs track <files>
git add .
git commit -m "commit message"
git push
```
### Disabling Git Annex in your repository
Before changing anything, make sure you have a backup of your repository first.
There are a couple of ways to do that, but you can clone it to another
local path and maybe push it to GitLab if you want a remote backup as well.
A guide on
[how to back up a **git-annex** repository to an external hard drive](https://www.thomas-krenn.com/en/wiki/Git-annex_Repository_on_an_External_Hard_Drive) is also available.
Because Annex files are stored as objects with symlinks and cannot be directly
modified, we need to first remove those symlinks.
NOTE:
Make sure the you read about the [`direct` mode](https://git-annex.branchable.com/direct_mode/) as it contains
information that may fit in your use case. The `annex direct` command is
deprecated in Git Annex version 6, so you may need to upgrade your repository
if the server also has Git Annex 6 installed. Read more in the
[Git Annex troubleshooting tips](../../../administration/git_annex.md#troubleshooting-tips) section.
1. Backup your repository
```shell
cd repository
git annex sync --content
cd ..
git clone repository repository-backup
cd repository-backup
git annex get
cd ..
```
1. Use `annex direct`:
```shell
cd repository
git annex direct
```
The output should be similar to this:
```shell
commit
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working tree clean
ok
direct debian.iso ok
direct ok
```
1. Disable Git Annex with [`annex uninit`](https://git-annex.branchable.com/git-annex-uninit/):
```shell
git annex uninit
```
The output should be similar to this:
```shell
unannex debian.iso ok
Deleted branch git-annex (was 2534d2c).
```
This command runs `unannex` on every file in the repository, leaving the original files.
1. Switch back to `indirect` mode:
```shell
git annex indirect
```
The output should be similar to this:
```shell
(merging origin/git-annex into git-annex...)
(recording state in git...)
commit (recording state in git...)
ok
(recording state in git...)
[master fac3194] commit before switching to indirect mode
1 file changed, 1 deletion(-)
delete mode 120000 alpine-virt-3.4.4-x86_64.iso
ok
indirect ok
ok
```
---
At this point, you have two options. Either add, commit and push the files
directly back to GitLab or switch to Git LFS. The LFS switch is described in
the next section.
### Enabling Git LFS in your repository
Git LFS is enabled by default on all GitLab products (GitLab CE, GitLab EE,
GitLab.com), therefore, you don't need to do anything server-side.
1. First, make sure you have `git-lfs` installed locally:
```shell
git lfs help
```
If the terminal doesn't prompt you with a full response on `git-lfs` commands,
[install the Git LFS client](https://git-lfs.github.com/) first.
1. Inside the repository, run the following command to initiate LFS:
```shell
git lfs install
```
1. Enable `git-lfs` for the group of files you want to track. You
can track specific files, all files containing the same extension, or an
entire directory:
```shell
git lfs track images/01.png # per file
git lfs track **/*.png # per extension
git lfs track images/ # per directory
```
After this, run `git status` to see the `.gitattributes` added
to your repository. It collects all file patterns that you chose to track via
`git-lfs`.
1. Add the files, commit and push them to GitLab:
```shell
git add .
git commit -m "commit message"
git push
```
If your remote is set up with HTTP, you are asked to enter your login
credentials. If you have [2FA enabled](../../../user/profile/account/two_factor_authentication.md), make sure to use a
[personal access token](../../../user/profile/account/two_factor_authentication.md#personal-access-tokens)
instead of your password.
## Removing the Git Annex branches
After the migration finishes successfully, you can remove all `git-annex`
related branches from your repository.
On GitLab, navigate to your project's **Repository > Branches** and delete all
branches created by Git Annex: `git-annex`, and all under `synced/`.
![repository branches](img/git-annex-branches.png)
You can also do this on the command line with:
```shell
git branch -d synced/master
git branch -d synced/git-annex
git push origin :synced/master
git push origin :synced/git-annex
git push origin :git-annex
git remote prune origin
```
If there are still some Annex objects inside your repository (`.git/annex/`)
or references inside `.git/config`, run `annex uninit` again:
```shell
git annex uninit
```
## Further Reading
- (Blog Post) [Getting Started with Git FLS](https://about.gitlab.com/blog/2017/01/30/getting-started-with-git-lfs-tutorial/)
- (Blog Post) [Announcing LFS Support in GitLab](https://about.gitlab.com/blog/2015/11/23/announcing-git-lfs-support-in-gitlab/)
- (Blog Post) [GitLab Annex Solves the Problem of Versioning Large Binaries with Git](https://about.gitlab.com/blog/2015/02/17/gitlab-annex-solves-the-problem-of-versioning-large-binaries-with-git/)
- [Git Annex](../../../administration/git_annex.md)
- [Git LFS](index.md)
<!-- This redirect file can be deleted after <2021-07-22>. -->
<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
......@@ -174,7 +174,6 @@ but commented out to help encourage others to add to it in the future. -->
## References
- [Getting Started with Git LFS](https://about.gitlab.com/blog/2017/01/30/getting-started-with-git-lfs-tutorial/)
- [Migrate from Git Annex to Git LFS](migrate_from_git_annex_to_git_lfs.md)
- [GitLab Git LFS user documentation](index.md)
- [GitLab Git LFS administrator documentation](../../../administration/lfs/index.md)
- Alternative method to [migrate an existing repository to Git LFS](https://github.com/git-lfs/git-lfs/wiki/Tutorial#migrating-existing-repository-data-to-lfs)
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment