    Fix large S3 uploads failing to finalize · 3714fc8f
    Stan Hu authored
    When large files are uploaded to object storage by Workhorse,
    CarrierWave is responsible for copying these files from their temporary
    location to a final location. However, if the file is above 5 GB, the
    upload will fail outright because AWS requires multipart uploads to be
    used to copy files above that limit.
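
    For context, S3 cannot copy an object larger than 5 GB with a single
    CopyObject call; the copy has to be split into UploadPartCopy requests
    and completed as a multipart upload. A rough sketch of that flow with
    the aws-sdk-s3 gem (bucket, keys, and part size are illustrative, not
    taken from this change):

        require 'aws-sdk-s3'

        client = Aws::S3::Client.new(region: 'us-east-1')

        bucket     = 'example-bucket'           # illustrative
        source_key = 'tmp/uploads/artifact.tgz' # illustrative
        target_key = 'uploads/artifact.tgz'     # illustrative
        part_size  = 100 * 1024 * 1024          # 100 MB parts

        total_size = client.head_object(bucket: bucket, key: source_key).content_length
        upload     = client.create_multipart_upload(bucket: bucket, key: target_key)

        parts = []
        offset = 0
        part_number = 1

        while offset < total_size
          last = [offset + part_size, total_size].min - 1

          # Each part is copied server-side with UploadPartCopy
          resp = client.upload_part_copy(
            bucket: bucket,
            key: target_key,
            upload_id: upload.upload_id,
            part_number: part_number,
            copy_source: "#{bucket}/#{source_key}",
            copy_source_range: "bytes=#{offset}-#{last}"
          )

          parts << { etag: resp.copy_part_result.etag, part_number: part_number }
          offset = last + 1
          part_number += 1
        end

        client.complete_multipart_upload(
          bucket: bucket,
          key: target_key,
          upload_id: upload.upload_id,
          multipart_upload: { parts: parts }
        )

    Done sequentially like this, each part copy waits for the previous one,
    which is where the slowness described below comes from.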
    
    Even if multipart uploads were used, files containing several gigabytes
    of data would usually fail to complete within the 60-second Web request
    timeout. In one test, a 6 GB file took several minutes to copy with
    fog-aws, while it only took 36 seconds with the aws CLI. The main
    difference: multithreading.
    
    fog-aws now supports multipart, multithreaded uploads per these pull
    requests (usage sketch below):
    
    * https://github.com/fog/fog-aws/pull/578
    * https://github.com/fog/fog-aws/pull/579
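
    With those changes, a large copy can be driven through the fog-aws
    `File#copy` API. The attribute names in this sketch
    (`multipart_chunk_size`, `concurrency`) are assumptions based on those
    pull requests, not verified against a specific fog-aws release:

        require 'fog/aws'

        storage = Fog::Storage.new(
          provider: 'AWS',
          aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
          aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
          region: 'us-east-1'
        )

        directory = storage.directories.get('example-bucket')     # illustrative
        file      = directory.files.get('tmp/uploads/large.bin')  # illustrative

        # Assumed attribute names based on the pull requests above: copies
        # above the single-request limit are split into parts and copied by
        # a pool of threads.
        file.multipart_chunk_size = 100 * 1024 * 1024
        file.concurrency = 10

        file.copy('example-bucket', 'uploads/large.bin')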
    
    For this to work, we also need to patch CarrierWave to use the
    `File#copy` method instead of the Fog connection `copy_object`
    method. We use a concurrency of 10 threads because this is what the AWS
    SDK uses, and it appears to give good performance for large uploads.
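
    A simplified sketch of that kind of monkey patch, loaded from a Rails
    initializer (module name and details are illustrative, not the exact
    patch in this commit):

        # config/initializers/carrierwave_patch.rb (illustrative)
        module S3MultipartCopyPatch
          def copy_to(new_path)
            # Go through fog-aws File#copy (multipart, multithreaded) rather
            # than the Fog connection's copy_object, which rejects objects
            # larger than 5 GB.
            file.concurrency = 10 if file.respond_to?(:concurrency=)
            file.copy(@uploader.fog_directory, new_path)

            CarrierWave::Storage::Fog::File.new(@uploader, @base, new_path)
          end
        end

        CarrierWave::Storage::Fog::File.prepend(S3MultipartCopyPatch)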
    
    This behavior is gated behind the `s3_multithreaded_uploads` feature
    flag. We enable it by default because GitLab.com uses Google Cloud
    Storage, so this S3-specific copy path does not affect GitLab.com
    itself.
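
    A hypothetical call site guarding the new behavior with GitLab's
    `Feature.enabled?` helper (the real call site is not shown in this
    commit message):

        if Feature.enabled?(:s3_multithreaded_uploads, default_enabled: true)
          file.concurrency = 10   # multithreaded multipart copy
        end
        file.copy(@uploader.fog_directory, new_path)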
    
    Relates to https://gitlab.com/gitlab-org/gitlab/-/issues/216442
s3_multithreaded_uploads.yml