1. 08 Sep, 2020 10 commits
  2. 21 Aug, 2020 2 commits
    • Add metadata-collect dracut module · 025a9bea
      Leo Le Bouter authored
      To install the dracut module on your current system, change into
      the dracut.module directory then run:
      
      ```
      $ ERP5_USER="user" ERP5_PASS="pass" \
        ERP5_BASE_URL="https://example.local/erp5" \
        make
      $ sudo make install
      ```
      
      To uninstall:
      
      ```
      $ sudo make uninstall
      ```
      
      To include the module, add the following to a dracut.conf file:
      
      ```
      add_dracutmodules="metadata-collect"
      ```
      
      You will also need to append "ip=dhcp rd.neednet=1" to the
      kernel_cmdline directive inside the dracut.conf so that the
      initramfs requests networking for the agent to upload results.
      
      Make sure the dracut network modules are installed. On Debian,
      they are provided by the dracut-network package.
      Otherwise, you can check for their presence using:
      
      ```
      $ dracut --list-modules | grep network
      ```
      
      The output should list a few network modules.
    • Use rustls instead of openssl · 4d94b540
      Leo Le Bouter authored
      With rustls it is easier to embed the root CA certificates inside
      the compiled binary itself using the webpki-roots crate. We need to
      do this because it is the easiest way of getting TLS certificate
      validation working inside the initramfs, where /etc/ssl/certs or
      an equivalent certificate store does not exist.
  3. 20 Aug, 2020 1 commit
    • Rewrite in Rust to obtain standalone static binary · d2277063
      Leo Le Bouter authored
      Contrary to Jean-Paul's guidelines on not using Rust due to the
      lack of knowledge about it inside Nexedi, I am using it here
      because it is the fastest way for me to get a working standalone
      static binary: it is the language I know best. Considering we must
      get results ASAP, this is the best strategy for me. We may
      later rewrite it in another language if necessary.
      
      A shell script is included to build the static binary. You need
      to install rustup to get Rust for musl, an alternative libc that
      makes it possible to create truly static binaries that embed libc
      itself.
      
      Rustup can be found at: https://rustup.rs/
      
      You can get a musl toolchain with:
        $ rustup target add x86_64-unknown-linux-musl
      
      The acl library is being downloaded and built as a static library
      by the script, and the rust build system will also build a vendored
      copy of openssl as a static library.
      
      Parallel hashing works a bit differently in this Rust version:
      only files in the directory currently being processed are hashed
      in parallel. If a directory contains a single big file, hashing
      stalls on that file until it is done, then moves on to the next
      directory. To clarify, each file is only ever hashed on a single
      thread, which the Python version also does. The difference is that
      the Python version keeps the number of files being hashed in
      parallel constant as long as there are more files to process,
      whereas this version uses one thread per file in the directory
      currently being processed. It was done this way for the sake of
      simplicity, but we can later implement an offload thread pool to
      mimic what was done in Python.
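
The per-directory strategy described above can be sketched as follows. This is only an illustration written in Python, not the Rust code from this commit: the function names `hash_file` and `hash_tree`, the choice of SHA-256, and the worker count are all assumptions made for the example.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def hash_file(path):
    """Hash one file on a single thread (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root, workers=4):
    """Hash files directory by directory; only files of the directory
    currently being processed are hashed in parallel."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as executor:
        for dirpath, _dirnames, filenames in os.walk(root):
            paths = [os.path.join(dirpath, name) for name in filenames]
            # Skip symlinks and special files to keep the sketch simple.
            paths = [p for p in paths
                     if os.path.isfile(p) and not os.path.islink(p)]
            # Consuming map() blocks until every file of this directory is
            # hashed, so one big file delays the move to the next directory.
            for path, digest in zip(paths, executor.map(hash_file, paths)):
                results[path] = digest
    return results
```

Because the loop only advances once the current directory is fully hashed, the pool can sit mostly idle behind one large file, which is the limitation an offload thread pool shared across directories would remove.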
  4. 19 Aug, 2020 1 commit
  5. 18 Aug, 2020 2 commits
    • Add setup script · e63cabb5
      Leo Le Bouter authored
    • Upload results to ERP5 · 7d922faa
      Leo Le Bouter authored
      TODO: Find a way to properly increment version without having to
            store any additional state client-side
      
      TODO: Investigate using HATEOAS to talk to ERP5
      
      TODO: Investigate using TLS client certificates to authenticate,
            they would be stored in /boot and would prevent the machine
            from booting if they were invalid or missing so that
            tampering with them is not interesting for an attacker.
            Also, the certificate's Common Name should be the computer
            reference and therefore should be used to construct the
            metadata snapshot document's reference instead of having
            to specify it on the command line.
  6. 14 Aug, 2020 2 commits
    • Formatting · d6bebb62
      Leo Le Bouter authored
    • Use MsgPack instead of JSON, add command line arguments + bug fixes · 86c55efd
      Leo Le Bouter authored
      * Convert stat_result to proper dictionary so that field names are
        retained after serialization
      
      * Add ability to ignore directories through command line arguments,
        explicitly add "ignored" field on ignored directories
      
      It was decided that JSON was not a suitable format because it
      lacks support for serializing bytes. MsgPack supports bytes and is
      more efficient. It is also the internal serialization format of
      Fluentd, which we will most probably use for ingesting data in a
      central place.
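
The first bullet above, converting a stat_result into a dictionary so that field names survive serialization, can be sketched like this. It is a minimal illustration using only the standard library; the helper name `stat_to_dict` is hypothetical and the commit's actual code may select fields differently.

```python
import os


def stat_to_dict(stat_result):
    """Turn an os.stat_result into a plain dict keyed by field name.

    A stat_result behaves like a tuple, so serializing it directly keeps
    only positional values; a dict keeps names such as st_mode and st_size.
    """
    return {
        name: getattr(stat_result, name)
        for name in dir(stat_result)
        if name.startswith("st_")
    }


# lstat() does not follow symlinks, so a link itself is described.
record = stat_to_dict(os.lstat("."))
```

The resulting dictionary contains only numbers keyed by strings, so it can be handed to any serializer, MsgPack included, without losing the field names.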
  7. 13 Aug, 2020 3 commits
    • do not follow symlinks in getxattr, close mp_pool first · 02a190aa
      Leo Le Bouter authored
      multiprocessing.Pool.close() ensures no new tasks can be submitted
      to the pool; the worker processes exit once all submitted tasks
      have completed. Even though AsyncResult.get() waits for each task
      to finish, and our code structure shouldn't submit new tasks at
      that point, call close() first, then get(). Otherwise this could
      become error-prone in the future if mp_tasks is modified while
      results are being merged back: we would miss some results because
      the iterator won't take these new items into account *during*
      iteration.
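
The close()-then-get() ordering can be sketched as below. This is an assumption-laden illustration, not the code from this commit: it uses multiprocessing.pool.ThreadPool, which exposes the same close(), join() and apply_async() methods as multiprocessing.Pool, so that the sketch runs without process start-up concerns. The function name `collect_lengths` and the use of `len` as the task are hypothetical stand-ins.

```python
from multiprocessing.pool import ThreadPool


def collect_lengths(items, workers=4):
    """Submit one task per item, close the pool, then merge results."""
    pool = ThreadPool(workers)
    # Stand-in for mp_tasks: one AsyncResult per submitted item.
    tasks = {item: pool.apply_async(len, (item,)) for item in items}

    # close() first: from here on nothing can be submitted, so the task
    # collection cannot grow while results are being merged back.
    pool.close()

    # get() then: each call blocks until that task has finished.
    results = {item: task.get() for item, task in tasks.items()}

    # join() waits for the workers to exit; only valid after close().
    pool.join()
    return results
```

Closing before merging turns an accidental late submission into an immediate error instead of a silently missing result.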
    • xattrs dict must be created first, decode xattrs as utf-8 · 001ed5c5
      Leo Le Bouter authored
      In Python, the JSON encoder cannot process bytes, and the JSON
      specification does not define a "bytes" type either. This
      constrains us: we cannot serialize data of bytes type.
      
      xattrs can be either strings or bytes. In practice they are likely
      representable as strings, so decode them as UTF-8 and raise an
      error otherwise. If real-world cases of xattrs in true binary
      format arise, we will work out another solution.
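
A minimal sketch of the decoding step described above, assuming the extended attributes have already been read into a mapping of bytes. The helper name `decode_xattrs` and the sample attribute are hypothetical.

```python
import json


def decode_xattrs(raw_xattrs):
    """Decode xattr names and values as UTF-8 so JSON can serialize them.

    bytes.decode() is strict by default, so truly binary values raise
    UnicodeDecodeError instead of being silently mangled.
    """
    xattrs = {}
    for name, value in raw_xattrs.items():
        if isinstance(name, bytes):
            name = name.decode("utf-8")
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        xattrs[name] = value
    return xattrs


# Hypothetical attribute, shaped like what an xattr lookup could return.
encoded = json.dumps(decode_xattrs({b"user.comment": b"hello"}))
```

Passing the raw bytes mapping to json.dumps() directly would raise TypeError, which is the constraint this commit works around.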
  8. 12 Aug, 2020 1 commit