Fix large S3 uploads failing to finalize
When large files are uploaded to object storage by Workhorse, CarrierWave is responsible for copying these files from their temporary location to a final location. However, if the file is above 5 GB, the upload fails outright because AWS requires multipart uploads to copy files above that limit.

Even if multipart uploads were used, files containing several gigabytes of data would usually fail to complete within the 60-second Web request timeout. In one test, a 6 GB file took several minutes to copy with fog-aws, while it took only 36 seconds with the aws CLI. The main difference: multithreading.

fog-aws now supports multipart, multithreaded uploads per these pull requests:

* https://github.com/fog/fog-aws/pull/578
* https://github.com/fog/fog-aws/pull/579

For this to work, we also need to patch CarrierWave to use the `File#copy` method instead of the Fog connection `copy_object` method. We use a concurrency of 10 threads because this is what the AWS SDK uses, and it appears to give good performance for large uploads.

This change is gated behind the `s3_multithreaded_uploads` feature flag. We enable it by default because GitLab.com uses Google Cloud Storage.

Relates to https://gitlab.com/gitlab-org/gitlab/-/issues/216442
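To illustrate why multithreading matters here, the following is a minimal sketch of splitting a large object into parts and copying them with up to 10 worker threads. It is not the fog-aws or CarrierWave implementation: `part_ranges`, `parallel_copy`, and the `copy_part` block are hypothetical names, and the block stands in for whatever call copies one byte range.

```ruby
# Illustrative sketch only: these helpers are hypothetical and are not part
# of fog-aws or CarrierWave.

# Split an object of `size` bytes into inclusive byte ranges of at most
# `chunk_size` bytes, one range per part.
def part_ranges(size, chunk_size)
  ranges = []
  offset = 0
  while offset < size
    last = [offset + chunk_size, size].min - 1
    ranges << (offset..last)
    offset += chunk_size
  end
  ranges
end

# Call the given block once per part with (part_number, byte_range), using at
# most `concurrency` threads. Results are returned in part order.
def parallel_copy(ranges, concurrency: 10, &copy_part)
  queue = Queue.new
  ranges.each_with_index { |range, index| queue << [index, range] }
  queue.close # once drained, a closed queue returns nil from pop

  results = Array.new(ranges.size)
  workers = Array.new([concurrency, ranges.size].min) do
    Thread.new do
      while (item = queue.pop)
        index, range = item
        # Part numbers are 1-based here; each thread writes a distinct slot.
        results[index] = copy_part.call(index + 1, range)
      end
    end
  end
  workers.each(&:join)
  results
end
```

For example, `parallel_copy(part_ranges(10, 4)) { |number, range| "#{number}:#{range}" }` processes three parts. In a real copy, the block would issue one part-copy request per range, so the total time is bounded by the slowest worker rather than by the sum of all parts.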