    kbuild: add a tool to list files ignored by git · 5c3d1d0a
    Masahiro Yamada authored
    In short, the motivation of this commit is to build a source package
    without cleaning the source tree.
    
    The deb-pkg and (src)rpm-pkg targets first run 'make clean' before
    creating a source tarball. Otherwise build artifacts such as *.o,
    *.a, etc. would be included in the tarball. Yet, the tarball ends up
    containing several garbage files since 'make clean' does not clean
    everything.
    
    Cleaning the tree every time is annoying since it makes the incremental
    build impossible. It is desirable to create a source tarball without
    cleaning the tree.
    
    In fact, there are some ways to achieve this.
    
    The easiest solution is 'git archive'. 'make perf-tar*-src-pkg' uses
    it, but I do not like it because it works only when the source tree is
    managed by git, and all files you want in the tarball must be committed
    in advance.
    
    I want to make it work without relying on git. We can do this.
    
    Files that are ignored by git are generated files, so should be excluded
    from the source tarball. We can list them out by parsing the .gitignore
    files. Of course, .gitignore does not cover all the cases, but it works
    well enough.
    
    tar(1) claims to support it:
    
      --exclude-vcs-ignores
    
        Exclude files that match patterns read from VCS-specific ignore files.
        Supported files are: .cvsignore, .gitignore, .bzrignore, and .hgignore.
    
    The best scenario would be to use 'tar --exclude-vcs-ignores', but this
    option does not work. --exclude-vcs-ignores does not understand any of
    the negation (!), preceding slash, following slash, etc. So, this option
    is just useless.
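
    To illustrate the .gitignore semantics listed above, here is a minimal
    Python sketch. It is not the C code imported from git: the helper names
    parse() and is_ignored() are hypothetical, and the matching is
    deliberately simplified (it relies on fnmatch, so '*' can also match
    '/', and '**' and backslash escapes are not handled).

```python
import fnmatch


def parse(line):
    """Split one .gitignore line into (negated, dir_only, anchored, glob)."""
    negated = line.startswith("!")
    if negated:
        line = line[1:]
    dir_only = line.endswith("/")       # following slash: directories only
    if dir_only:
        line = line[:-1]
    anchored = "/" in line              # preceding (or middle) slash anchors the pattern
    return negated, dir_only, anchored, line.lstrip("/")


def is_ignored(path, is_dir, patterns):
    """Return True if `path` (relative to the .gitignore) is ignored.

    Simplified assumption: patterns are checked in order and the last
    matching one wins, which is how negation (!) takes effect.
    """
    ignored = False
    for line in patterns:
        negated, dir_only, anchored, glob = parse(line)
        if dir_only and not is_dir:
            continue
        # Unanchored patterns are compared against the basename only.
        target = path if anchored else path.rsplit("/", 1)[-1]
        if fnmatch.fnmatchcase(target, glob):
            ignored = not negated
    return ignored
```

    With the patterns "*.o", "!keep.o", "/vmlinux" and "build/", this sketch
    treats drivers/foo.o as ignored, drivers/keep.o as not ignored, vmlinux
    as ignored only at the top level, and build as ignored only when it is
    a directory.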
    
    Hence, I wrote this gitignore parser. The previous version [1], written
    in Python, was too slow. This version is implemented in C, so it works
    much faster.
    
    I imported the code from git (commit: 23c56f7bd5f1), so we get the same
    result.
    
    This tool traverses the source tree, parsing all .gitignore files, and
    prints file paths that are ignored by git.
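
    The traversal described above can be sketched in Python as follows.
    This is only an illustration under stated assumptions, not the actual
    implementation: read_patterns(), matches() and list_ignored() are
    hypothetical names, only basename patterns and negation are handled,
    and entries are sorted to keep the output deterministic.

```python
import fnmatch
import os


def read_patterns(dirpath):
    """Return the non-blank, non-comment lines of dirpath/.gitignore, or []."""
    try:
        with open(os.path.join(dirpath, ".gitignore")) as f:
            lines = [line.strip() for line in f]
    except FileNotFoundError:
        return []
    return [line for line in lines if line and not line.startswith("#")]


def matches(name, patterns):
    """Basename-only matching; the last matching pattern wins."""
    ignored = False
    for pat in patterns:
        negated = pat.startswith("!")
        glob = pat[1:] if negated else pat
        if fnmatch.fnmatchcase(name, glob):
            ignored = not negated
    return ignored


def list_ignored(top, inherited=()):
    """Yield ignored paths under `top`.

    Patterns from parent directories are inherited, and patterns from a
    deeper .gitignore come later in the list, so they take precedence.
    Ignored directories are reported once and not descended into.
    """
    patterns = list(inherited) + read_patterns(top)
    for entry in sorted(os.scandir(top), key=lambda e: e.name):
        if entry.name == ".git":
            continue
        if matches(entry.name, patterns):
            yield entry.path
        elif entry.is_dir(follow_symlinks=False):
            yield from list_ignored(entry.path, patterns)
```

    Calling list_ignored(".") at the top of a tree would yield one path per
    ignored file or directory.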
    
    The output is similar to 'git ls-files --ignored --directory --others
    --exclude-per-directory=.gitignore', except
    
      [1] Not sorted
      [2] No trailing slash for directories
    
    [2] is intentional because tar's --exclude-from option cannot handle
    trailing slashes.
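
    As a rough illustration of difference [2], the following Python sketch
    converts 'git ls-files --directory' style paths into lines suitable for
    an exclude list. The function name to_exclude_lines() is hypothetical;
    the prefix argument only mirrors the --prefix=./ option used in the
    test below.

```python
def to_exclude_lines(paths, prefix="./"):
    """Prepend `prefix` and drop the trailing slash that marks directories."""
    return [prefix + p.rstrip("/") for p in paths]
```

    For example, "scripts/basic/" becomes "./scripts/basic", while a plain
    file path such as "vmlinux" only gains the prefix.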
    
    [How to test this tool]
    
      $ git clean -dfx
      $ make -s -j$(nproc) defconfig all                       # or allmodconfig or whatever
      $ git archive -o ../linux1.tar --prefix=./ HEAD
      $ tar tf ../linux1.tar | LANG=C sort > ../file-list1     # files emitted by 'git archive'
      $ make scripts_package
        HOSTCC  scripts/list-gitignored
      $ scripts/list-gitignored  --prefix=./ -o ../exclude-list
      $ tar cf ../linux2.tar --exclude-from=../exclude-list .
      $ tar tf ../linux2.tar | LANG=C sort > ../file-list2     # files emitted by 'tar'
      $ diff  ../file-list1 ../file-list2 | grep -E '^(<|>)'
      < ./Documentation/devicetree/bindings/.yamllint
      < ./drivers/clk/.kunitconfig
      < ./drivers/gpu/drm/tests/.kunitconfig
      < ./drivers/hid/.kunitconfig
      < ./fs/ext4/.kunitconfig
      < ./fs/fat/.kunitconfig
      < ./kernel/kcsan/.kunitconfig
      < ./lib/kunit/.kunitconfig
      < ./mm/kfence/.kunitconfig
      < ./tools/testing/selftests/arm64/tags/
      < ./tools/testing/selftests/arm64/tags/.gitignore
      < ./tools/testing/selftests/arm64/tags/Makefile
      < ./tools/testing/selftests/arm64/tags/run_tags_test.sh
      < ./tools/testing/selftests/arm64/tags/tags_test.c
      < ./tools/testing/selftests/kvm/.gitignore
      < ./tools/testing/selftests/kvm/Makefile
      < ./tools/testing/selftests/kvm/config
      < ./tools/testing/selftests/kvm/settings
    
    The source tarball contains most of the files that are tracked by git.
    You see some diffs, but that is just because some .gitignore files are
    wrong.
    
      $ git ls-files -i -c --exclude-per-directory=.gitignore
      Documentation/devicetree/bindings/.yamllint
      drivers/clk/.kunitconfig
      drivers/gpu/drm/tests/.kunitconfig
      drivers/hid/.kunitconfig
      fs/ext4/.kunitconfig
      fs/fat/.kunitconfig
      kernel/kcsan/.kunitconfig
      lib/kunit/.kunitconfig
      mm/kfence/.kunitconfig
      tools/testing/selftests/arm64/tags/.gitignore
      tools/testing/selftests/arm64/tags/Makefile
      tools/testing/selftests/arm64/tags/run_tags_test.sh
      tools/testing/selftests/arm64/tags/tags_test.c
      tools/testing/selftests/kvm/.gitignore
      tools/testing/selftests/kvm/Makefile
      tools/testing/selftests/kvm/config
      tools/testing/selftests/kvm/settings
    
    [1]: https://lore.kernel.org/all/20230128173843.765212-1-masahiroy@kernel.org/

    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
list-gitignored.c 25.8 KB