Commit 990bb8b3 authored by GitLab Bot

Merge remote-tracking branch 'upstream/master' into ce-to-ee-2018-06-22

parents 49090c95 70bf08b5
---
title: Cleanup Prometheus ruby metrics
merge_request: 20039
author: Ben Kochie
type: fixed
---
title: '[Rails5] Fix "-1 is not a valid data_store"'
merge_request: 19917
author: "@blackst0ne"
type: fixed
@@ -87,6 +87,20 @@ the `monitoring.sidekiq_exporter` configuration option in `gitlab.yml`.

| geo_wikis_verification_failed_count | Gauge | 10.7 | Number of wikis that failed to verify on the secondary | url |
| geo_wikis_checksum_mismatch_count | Gauge | 10.7 | Number of wikis with a checksum mismatch on the secondary | url |

+### Ruby metrics
+
+Some basic Ruby runtime metrics are available:
+
+| Metric | Type | Since | Description |
+|:-------------------------------------- |:--------- |:----- |:----------- |
+| ruby_gc_duration_seconds_total | Counter | 11.1 | Time spent by Ruby in GC |
+| ruby_gc_stat_... | Gauge | 11.1 | Various metrics from [GC.stat] |
+| ruby_file_descriptors | Gauge | 11.1 | File descriptors per process |
+| ruby_memory_bytes | Gauge | 11.1 | Memory usage by process |
+| ruby_sampler_duration_seconds_total | Counter | 11.1 | Time spent collecting stats |
+
+[GC.stat]: https://ruby-doc.org/core-2.3.0/GC.html#method-c-stat

## Metrics shared directory

GitLab's Prometheus client requires a directory to store metrics data shared between multi-process services.
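The `ruby_gc_stat_...` row above abbreviates a whole family of gauges, one per key of Ruby's `GC.stat` hash (this is exactly how the sampler code later in this commit registers them). A quick sketch to see which series to expect; plain Ruby, and the key set varies by Ruby version:

```ruby
# Each GC.stat key becomes one ruby_gc_stat_<key> gauge.
GC.stat.keys.each do |key|
  puts "ruby_gc_stat_#{key}"
end
# => ruby_gc_stat_count, ruby_gc_stat_heap_live_slots, ... (version dependent)
```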
...
@@ -17,7 +17,7 @@ There are various configuration options to help GitLab server administrators:

* Enabling/disabling Git LFS support
* Changing the location of LFS object storage
-* Setting up AWS S3 compatible object storage
+* Setting up object storage supported by [Fog](http://fog.io/about/provider_documentation.html)

### Configuration for Omnibus installations
@@ -44,19 +44,31 @@ In `config/gitlab.yml`:

    storage_path: /mnt/storage/lfs-objects
```

-## Storing the LFS objects in an S3-compatible object storage
+## Storing LFS objects in remote object storage

> [Introduced][ee-2760] in [GitLab Premium][eep] 10.0. Brought to GitLab Core
> in 10.7.

-It is possible to store LFS objects on a remote object storage, which allows you
-to offload storage to an external AWS S3 compatible service, freeing up disk
-space locally. You can also host your own S3 compatible storage decoupled from
-GitLab, with a service such as [Minio](https://www.minio.io/).
-
-Object storage currently transfers files first to GitLab, and then on to the
-object storage in a second stage. This can be done either by using a rake task
-to transfer existing objects, or in a background job after each file is received.
+It is possible to store LFS objects in remote object storage, which allows you
+to offload local hard disk R/W operations and free up disk space significantly.
+GitLab is tightly integrated with `Fog`, so you can refer to its [documentation](http://fog.io/about/provider_documentation.html)
+to check which storage services can be integrated with GitLab.
+You can also use external object storage on a private local network. For example,
+[Minio](https://www.minio.io/) is a standalone object storage service that is easy to set up and works well with GitLab instances.
+
+GitLab provides two options for the uploading mechanism: "Direct upload" and "Background upload" (a configuration sketch follows the lists below).
+
+**Option 1. Direct upload**
+
+1. User pushes an LFS file to the GitLab instance
+1. GitLab-workhorse uploads the file directly to the external object storage
+1. GitLab-workhorse notifies GitLab-rails that the upload process is complete
+
+**Option 2. Background upload**
+
+1. User pushes an LFS file to the GitLab instance
+1. GitLab-rails stores the file in the local file storage
+1. GitLab-rails then uploads the file to the external object storage asynchronously
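For Omnibus installations, choosing between the two mechanisms comes down to the `lfs_object_store_`-prefixed settings described below. A minimal sketch, assuming the `direct_upload`/`background_upload` key names referenced in the troubleshooting section at the end of this page:

```ruby
# /etc/gitlab/gitlab.rb (sketch; key names are assumptions derived from the
# lfs_object_store_ prefix plus the direct_upload/background_upload settings)
gitlab_rails['lfs_object_store_enabled'] = true
gitlab_rails['lfs_object_store_direct_upload'] = true       # Option 1
gitlab_rails['lfs_object_store_background_upload'] = false  # Option 2 disabled
```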
The following general settings are supported.
@@ -71,16 +83,50 @@ The following general settings are supported.

The `connection` settings match those provided by [Fog](https://github.com/fog).

+Here is a configuration example with S3.
+
-| Setting | Description | Default |
+| Setting | Description | Example |
|---------|-------------|---------|
-| `provider` | Always `AWS` for compatible hosts | AWS |
+| `provider` | The provider name | AWS |
-| `aws_access_key_id` | AWS credentials, or compatible | |
+| `aws_access_key_id` | AWS credentials, or compatible | `ABC123DEF456` |
-| `aws_secret_access_key` | AWS credentials, or compatible | |
+| `aws_secret_access_key` | AWS credentials, or compatible | `ABC123DEF456ABC123DEF456ABC123DEF456` |
| `region` | AWS region | us-east-1 |
| `host` | S3 compatible host for when not using AWS, e.g. `localhost` or `storage.example.com` | s3.amazonaws.com |
| `endpoint` | Can be used when configuring an S3 compatible service such as [Minio](https://www.minio.io), by entering a URL such as `http://127.0.0.1:9000` | (optional) |
| `path_style` | Set to true to use `host/bucket_name/object` style paths instead of `bucket_name.host/object`. Leave as false for AWS S3 | false |
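On an Omnibus installation, where settings take the `lfs_object_store_` prefix as noted below, the S3 example values would translate into roughly the following connection hash. This is a sketch with the placeholder credentials from the table, not a drop-in configuration:

```ruby
# /etc/gitlab/gitlab.rb (sketch): Fog connection settings for S3,
# reusing the placeholder values from the table above.
gitlab_rails['lfs_object_store_connection'] = {
  'provider' => 'AWS',
  'region' => 'us-east-1',
  'aws_access_key_id' => 'ABC123DEF456',
  'aws_secret_access_key' => 'ABC123DEF456ABC123DEF456ABC123DEF456'
}
```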
+Here is a configuration example with GCS.
+
+| Setting | Description | Example |
+|---------|-------------|---------|
+| `provider` | The provider name | `Google` |
+| `google_project` | GCP project name | `gcp-project-12345` |
+| `google_client_email` | The email address of the service account | `foo@gcp-project-12345.iam.gserviceaccount.com` |
+| `google_json_key_location` | The path to the JSON key file | `/path/to/gcp-project-12345-abcde.json` |
+
+_NOTE: The service account must have permission to access the bucket. [See more](https://cloud.google.com/storage/docs/authentication)_
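The equivalent sketch for GCS, again assuming an Omnibus installation and reusing the example values from the table:

```ruby
# /etc/gitlab/gitlab.rb (sketch): Fog connection settings for GCS.
gitlab_rails['lfs_object_store_connection'] = {
  'provider' => 'Google',
  'google_project' => 'gcp-project-12345',
  'google_client_email' => 'foo@gcp-project-12345.iam.gserviceaccount.com',
  'google_json_key_location' => '/path/to/gcp-project-12345-abcde.json'
}
```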
+### Manual uploading to an object storage
+
+There are two ways to manually do the same thing as automatic uploading (described above).
+
+**Option 1: Rake task**
+
+```
+$ rake gitlab:lfs:migrate
+```
+
+**Option 2: Rails console**
+
+```
+$ sudo gitlab-rails console  # Log in to the Rails console
+> # Upload LFS files manually
+> LfsObject.where(file_store: [nil, 1]).find_each do |lfs_object|
+>   lfs_object.file.migrate!(ObjectStorage::Store::REMOTE) if lfs_object.file.file.exists?
+> end
+```
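For readers puzzling over the `file_store: [nil, 1]` condition: assuming the conventional store constants in GitLab's `ObjectStorage` module, it selects objects still held locally, with `nil` covering rows written before the column existed:

```ruby
# Assumed constant values; check ObjectStorage::Store in your GitLab version.
ObjectStorage::Store::LOCAL   # => 1 (file on the GitLab server's disk)
ObjectStorage::Store::REMOTE  # => 2 (file in the object storage bucket)
```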
### S3 for Omnibus installations

On Omnibus installations, the settings are prefixed by `lfs_object_store_`:

@@ -156,6 +202,29 @@ You can see the total storage used for LFS objects on groups and projects

in the administration area, as well as through the [groups](../../api/groups.md)
and [projects APIs](../../api/projects.md).
+## Troubleshooting: `Google::Apis::TransmissionError: execution expired`
+
+If LFS integration is configured with Google Cloud Storage and background uploads (`background_upload: true` and `direct_upload: false`),
+Sidekiq workers may encounter this error because uploads of very large files time out.
+LFS files of up to 6GB can be uploaded without any extra steps; for larger files, use the following workaround.
+
+```shell
+$ sudo gitlab-rails console  # Log in to the Rails console
+> # Set up timeouts. 20 minutes is enough to upload 30GB LFS files.
+> # These settings are only in effect for the same session, i.e. they do not apply to Sidekiq workers.
+> ::Google::Apis::ClientOptions.default.open_timeout_sec = 1200
+> ::Google::Apis::ClientOptions.default.read_timeout_sec = 1200
+> ::Google::Apis::ClientOptions.default.send_timeout_sec = 1200
+
+> # Upload LFS files manually. This process does not use Sidekiq at all.
+> LfsObject.where(file_store: [nil, 1]).find_each do |lfs_object|
+>   lfs_object.file.migrate!(ObjectStorage::Store::REMOTE) if lfs_object.file.file.exists?
+> end
+```
+
+See more information in [!19581](https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/19581).
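As the comments note, these timeouts die with the console session and never reach Sidekiq. If you need them process-wide, one possible approach (hypothetical and undocumented here, not an official setting) is a Rails initializer that applies the same defaults at boot:

```ruby
# config/initializers/google_api_timeouts.rb (hypothetical): apply the same
# timeouts to every process, including Sidekiq workers.
require 'google/apis/options'

::Google::Apis::ClientOptions.default.open_timeout_sec = 1200
::Google::Apis::ClientOptions.default.read_timeout_sec = 1200
::Google::Apis::ClientOptions.default.send_timeout_sec = 1200
```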
## Known limitations

* Support for removing unreferenced LFS objects was added in 8.14.
...
@@ -607,17 +607,7 @@ module Gitlab

      def ref_name_for_sha(ref_path, sha)
        raise ArgumentError, "sha can't be empty" unless sha.present?

-       gitaly_migrate(:find_ref_name) do |is_enabled|
-         if is_enabled
-           gitaly_ref_client.find_ref_name(sha, ref_path)
-         else
-           args = %W(for-each-ref --count=1 #{ref_path} --contains #{sha})
-           # Not found -> ["", 0]
-           # Found -> ["b8d95eb4969eefacb0a58f6a28f6803f8070e7b9 commit\trefs/environments/production/77\n", 0]
-           run_git(args).first.split.last
-         end
-       end
+       gitaly_ref_client.find_ref_name(sha, ref_path)
      end

      # Get refs hash whose key is the commit id
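The removed fallback documents the method's contract: given a ref namespace and a sha, return the first ref under that namespace that contains the sha. A hypothetical call, reusing the example from the deleted comment:

```ruby
# Hypothetical usage; `repository` is a Gitlab::Git::Repository instance.
repository.ref_name_for_sha(
  'refs/environments/production',
  'b8d95eb4969eefacb0a58f6a28f6803f8070e7b9'
)
# => "refs/environments/production/77"
```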
@@ -1816,14 +1806,6 @@ module Gitlab

        commit(sha)
      end

-     def size_by_shelling_out
-       popen(%w(du -sk), path).first.strip.to_i
-     end
-
-     def size_by_gitaly
-       gitaly_repository_client.repository_size
-     end
-
      # Returns true if the given ref name exists
      #
      # Ref names must start with `refs/`.
...
@@ -22,27 +22,27 @@ module Gitlab

      def init_metrics
        metrics = {}
-       metrics[:sampler_duration] = Metrics.histogram(with_prefix(:sampler_duration, :seconds), 'Sampler time', { worker: nil })
-       metrics[:total_time] = Metrics.gauge(with_prefix(:gc, :time_total), 'Total GC time', labels, :livesum)
+       metrics[:sampler_duration] = Metrics.counter(with_prefix(:sampler, :duration_seconds_total), 'Sampler time', labels)
+       metrics[:total_time] = Metrics.counter(with_prefix(:gc, :duration_seconds_total), 'Total GC time', labels)
        GC.stat.keys.each do |key|
-         metrics[key] = Metrics.gauge(with_prefix(:gc, key), to_doc_string(key), labels, :livesum)
+         metrics[key] = Metrics.gauge(with_prefix(:gc_stat, key), to_doc_string(key), labels, :livesum)
        end

-       metrics[:objects_total] = Metrics.gauge(with_prefix(:objects, :total), 'Objects total', labels.merge(class: nil), :livesum)
-       metrics[:memory_usage] = Metrics.gauge(with_prefix(:memory, :usage_total), 'Memory used total', labels, :livesum)
-       metrics[:file_descriptors] = Metrics.gauge(with_prefix(:file, :descriptors_total), 'File descriptors total', labels, :livesum)
+       metrics[:memory_usage] = Metrics.gauge(with_prefix(:memory, :bytes), 'Memory used', labels, :livesum)
+       metrics[:file_descriptors] = Metrics.gauge(with_prefix(:file, :descriptors), 'File descriptors used', labels, :livesum)

        metrics
      end

      def sample
        start_time = System.monotonic_time
-       sample_gc
-       metrics[:memory_usage].set(labels, System.memory_usage)
-       metrics[:file_descriptors].set(labels, System.file_descriptor_count)
+       metrics[:memory_usage].set(labels.merge(worker_label), System.memory_usage)
+       metrics[:file_descriptors].set(labels.merge(worker_label), System.file_descriptor_count)
+       sample_gc

-       metrics[:sampler_duration].observe(labels.merge(worker_label), System.monotonic_time - start_time)
+       metrics[:sampler_duration].increment(labels, System.monotonic_time - start_time)
      ensure
        GC::Profiler.clear
      end
@@ -50,11 +50,13 @@ module Gitlab

      private

      def sample_gc
-       metrics[:total_time].set(labels, GC::Profiler.total_time * 1000)
+       # Collect generic GC stats.
        GC.stat.each do |key, value|
          metrics[key].set(labels, value)
        end
+
+       # Collect the GC time since last sample in float seconds.
+       metrics[:total_time].increment(labels, GC::Profiler.total_time)
      end

      def worker_label
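The counter-based `sample_gc` works because `GC::Profiler.clear` runs in the sampler's `ensure` block: `GC::Profiler.total_time` therefore always holds only the GC seconds accrued since the previous sample, which is exactly the delta a counter's `increment` expects. A standalone sketch of that contract, in plain Ruby with no GitLab code involved:

```ruby
# Accumulate GC time via deltas, mirroring the sampler's clear-after-read pattern.
GC::Profiler.enable

gc_seconds_total = 0.0
3.times do
  100_000.times { Object.new }                 # generate some garbage
  GC.start
  gc_seconds_total += GC::Profiler.total_time  # GC seconds since last clear
  GC::Profiler.clear                           # reset for the next sample
end

puts format('ruby_gc_duration_seconds_total %.6f', gc_seconds_total)
```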
...
@@ -20,10 +20,14 @@ module RuboCop

        'necessary'.freeze

      LARGE_TABLES = %i[
-       ci_pipelines
+       ci_build_trace_sections
        ci_builds
+       ci_job_artifacts
+       ci_pipelines
+       ci_stages
        events
        issues
+       merge_request_diff_commits
        merge_request_diff_files
        merge_request_diffs
        merge_requests
@@ -34,8 +38,15 @@ module RuboCop

        users
      ].freeze

+     BATCH_UPDATE_METHODS = %w[
+       :add_column_with_default
+       :change_column_type_concurrently
+       :rename_column_concurrently
+       :update_column_in_batches
+     ].join(' ').freeze
+
      def_node_matcher :batch_update?, <<~PATTERN
-       (send nil? ${:add_column_with_default :update_column_in_batches} $(sym ...) ...)
+       (send nil? ${#{BATCH_UPDATE_METHODS}} $(sym ...) ...)
      PATTERN

      def on_send(node)
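The `%w[...].join(' ')` constant simply interpolates a space-separated union into the node pattern. Printing the interpolated string shows what the matcher actually compiles; plain Ruby, runnable on its own:

```ruby
# Reproduces the pattern interpolation used by the cop above.
BATCH_UPDATE_METHODS = %w[
  :add_column_with_default
  :change_column_type_concurrently
  :rename_column_concurrently
  :update_column_in_batches
].join(' ').freeze

puts "(send nil? ${#{BATCH_UPDATE_METHODS}} $(sym ...) ...)"
# => (send nil? ${:add_column_with_default :change_column_type_concurrently
#    :rename_column_concurrently :update_column_in_batches} $(sym ...) ...)
```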
...
@@ -45,7 +45,7 @@ describe Gitlab::Metrics::Samplers::RubySampler do

    it 'adds a metric containing garbage collection time statistics' do
      expect(GC::Profiler).to receive(:total_time).and_return(0.24)

-     expect(sampler.metrics[:total_time]).to receive(:set).with({}, 240)
+     expect(sampler.metrics[:total_time]).to receive(:increment).with({}, 0.24)

      sampler.sample
    end
...
@@ -125,14 +125,6 @@ describe Ci::BuildTraceChunk, :clean_gitlab_redis_shared_state do

        end
      end
    end
-
-   context 'when data_store is others' do
-     before do
-       build_trace_chunk.send(:write_attribute, :data_store, -1)
-     end
-
-     it { expect { subject }.to raise_error('Unsupported data store') }
-   end
  end

  describe '#truncate' do
...
@@ -32,6 +32,14 @@ describe RuboCop::Cop::Migration::UpdateLargeTable do

      include_examples 'large tables', 'add_column_with_default'
    end

+   context 'for the change_column_type_concurrently method' do
+     include_examples 'large tables', 'change_column_type_concurrently'
+   end
+
+   context 'for the rename_column_concurrently method' do
+     include_examples 'large tables', 'rename_column_concurrently'
+   end
+
    context 'for the update_column_in_batches method' do
      include_examples 'large tables', 'update_column_in_batches'
    end
@@ -60,6 +68,18 @@ describe RuboCop::Cop::Migration::UpdateLargeTable do

      expect(cop.offenses).to be_empty
    end

+   it 'registers no offense for change_column_type_concurrently' do
+     inspect_source("change_column_type_concurrently :#{table}, :column, default: true")
+
+     expect(cop.offenses).to be_empty
+   end
+
+   it 'registers no offense for rename_column_concurrently' do
+     inspect_source("rename_column_concurrently :#{table}, :column, default: true")
+
+     expect(cop.offenses).to be_empty
+   end
+
    it 'registers no offense for add_column_with_default' do
      inspect_source("add_column_with_default :#{table}, :column, default: true")
...