Commit 64c638e2 authored by Dylan Griffith's avatar Dylan Griffith

Introduce instrumentation for all Elasticsearch HTTP requests

This introduces the same pattern of instrumentation [we use for
Redis](https://gitlab.com/gitlab-org/gitlab/-/blob/b16db2726d7b429d740ef3642dda13b528fc4fda/lib/gitlab/instrumentation/redis.rb).
This MR also adds the instrumentation details to sidekiq logs like we do
with Redis.

This instruments at the lowest level by instrumenting the
elasticsearch-transport gem which is used by the elasticsearch-rails gem
and elasticsearch-api gem which are the higher level abstractions used
by GitLab. This has been tested to log requests coming from both places.

Other options considered for this solution involved following a similar
pattern to
https://github.com/elastic/elasticsearch-rails/blob/1c3226d8186af6e8446fa5091f5b27760c29f896/elasticsearch-rails/lib/elasticsearch/rails/instrumentation/log_subscriber.rb
which would extend the log subscriber pattern to be more than just
searches in the context of the elasticsearch-rails gem but this seemed
more convoluted than the approach I took since it requires monkey
patching to publish a message then a separate class to subscribe and
store those messages and so the pattern didn't seem to be beneficial
unless you plan to have multiple subscribers.

Additionally I considered using Faraday middleware to track this since
the Elasticsearch gem uses Faraday by default but again I opted not to
do that since it seemed trickier to manipulate the client config
following
https://github.com/elastic/elasticsearch-ruby/tree/master/elasticsearch-transport#transport-implementations
and implement the required middleware etc. than just prepending the
method following the same pattern as Redis.
parent 59122560
...@@ -77,6 +77,14 @@ User clone and fetch activity using HTTP transport appears in this log as `actio ...@@ -77,6 +77,14 @@ User clone and fetch activity using HTTP transport appears in this log as `actio
In addition, the log contains the originating IP address, In addition, the log contains the originating IP address,
(`remote_ip`),the user's ID (`user_id`), and username (`username`). (`remote_ip`),the user's ID (`user_id`), and username (`username`).
Some endpoints such as `/search` may make requests to Elasticsearch if using
[Advanced Global Search](../user/search/advanced_global_search.md). These will
additionally log `elasticsearch_calls` and `elasticsearch_call_duration_s`,
which correspond to:
1. `elasticsearch_calls`: total number of calls to Elasticsearch
1. `elasticsearch_duration_s`: total time taken by Elasticsearch calls
NOTE: **Note:** Starting with GitLab 12.5, if an error occurs, an NOTE: **Note:** Starting with GitLab 12.5, if an error occurs, an
`exception` field is included with `class`, `message`, and `exception` field is included with `class`, `message`, and
`backtrace`. Previous versions included an `error` field instead of `backtrace`. Previous versions included an `error` field instead of
......
# frozen_string_literal: true
require 'spec_helper'
# We don't want to interact with Elasticsearch in GitLab FOSS so we test
# this in ee/ only. The code exists in FOSS and won't do anything.
describe ::Gitlab::Instrumentation::ElasticsearchTransport, :elastic, :request_store do
describe '.increment_request_count' do
it 'increases the request count by 1' do
expect { described_class.increment_request_count }.to change(described_class, :get_request_count).by(1)
end
end
describe '.add_duration' do
it 'does not lose precision while adding' do
::Gitlab::SafeRequestStore.clear!
precision = 1.0 / (10**::Gitlab::InstrumentationHelper::DURATION_PRECISION)
2.times { described_class.add_duration(0.4 * precision) }
# 2 * 0.4 should be 0.8 and get rounded to 1
expect(described_class.query_time).to eq(1 * precision)
end
end
describe '.add_call_details' do
before do
stub_ee_application_setting(elasticsearch_search: true, elasticsearch_indexing: true)
allow(::Gitlab::PerformanceBar).to receive(:enabled_for_request?).and_return(true)
end
it 'parses and tracks the call details' do
ensure_elasticsearch_index!
::Gitlab::SafeRequestStore.clear!
create(:issue, title: "new issue")
ensure_elasticsearch_index!
request = ::Gitlab::Instrumentation::ElasticsearchTransport.detail_store.first
expect(request[:method]).to eq("POST")
expect(request[:path]).to eq("_bulk")
end
end
end
describe ::Gitlab::Instrumentation::ElasticsearchTransportInterceptor, :elastic, :request_store do
before do
allow(::Gitlab::PerformanceBar).to receive(:enabled_for_request?).and_return(true)
end
it 'tracks any requests via the Elasticsearch client' do
ensure_elasticsearch_index!
expect(::Gitlab::Instrumentation::ElasticsearchTransport.get_request_count).to be > 0
expect(::Gitlab::Instrumentation::ElasticsearchTransport.query_time).to be > 0
expect(::Gitlab::Instrumentation::ElasticsearchTransport.detail_store).not_to be_empty
end
end
# frozen_string_literal: true
require 'spec_helper'
describe Gitlab::InstrumentationHelper do
describe '.add_instrumentation_data', :request_store do
let(:payload) { {} }
subject { described_class.add_instrumentation_data(payload) }
# We don't want to interact with Elasticsearch in GitLab FOSS so we test
# this in ee/ only. The code exists in FOSS and won't do anything.
context 'when Elasticsearch calls are made', :elastic do
it 'adds Elasticsearch data' do
ensure_elasticsearch_index!
subject
expect(payload[:elasticsearch_calls]).to be > 0
expect(payload[:elasticsearch_duration_s]).to be > 0
end
end
end
end
# frozen_string_literal: true
require 'elasticsearch-transport'
module Gitlab
module Instrumentation
module ElasticsearchTransportInterceptor
def perform_request(*args)
start = Time.now
super
ensure
if ::Gitlab::SafeRequestStore.active?
duration = (Time.now - start)
::Gitlab::Instrumentation::ElasticsearchTransport.increment_request_count
::Gitlab::Instrumentation::ElasticsearchTransport.add_duration(duration)
::Gitlab::Instrumentation::ElasticsearchTransport.add_call_details(duration, args)
end
end
end
class ElasticsearchTransport
ELASTICSEARCH_REQUEST_COUNT = :elasticsearch_request_count
ELASTICSEARCH_CALL_DURATION = :elasticsearch_call_duration
ELASTICSEARCH_CALL_DETAILS = :elasticsearch_call_details
def self.get_request_count
::Gitlab::SafeRequestStore[ELASTICSEARCH_REQUEST_COUNT] || 0
end
def self.increment_request_count
::Gitlab::SafeRequestStore[ELASTICSEARCH_REQUEST_COUNT] ||= 0
::Gitlab::SafeRequestStore[ELASTICSEARCH_REQUEST_COUNT] += 1
end
def self.detail_store
::Gitlab::SafeRequestStore[ELASTICSEARCH_CALL_DETAILS] ||= []
end
def self.query_time
query_time = ::Gitlab::SafeRequestStore[ELASTICSEARCH_CALL_DURATION] || 0
query_time.round(::Gitlab::InstrumentationHelper::DURATION_PRECISION)
end
def self.add_duration(duration)
::Gitlab::SafeRequestStore[ELASTICSEARCH_CALL_DURATION] ||= 0
::Gitlab::SafeRequestStore[ELASTICSEARCH_CALL_DURATION] += duration
end
def self.add_call_details(duration, args)
return unless Gitlab::PerformanceBar.enabled_for_request?
detail_store << {
method: args[0],
path: args[1],
params: args[2],
body: args[3],
duration: duration,
backtrace: ::Gitlab::BacktraceCleaner.clean_backtrace(caller)
}
end
end
end
end
class ::Elasticsearch::Transport::Client
prepend ::Gitlab::Instrumentation::ElasticsearchTransportInterceptor
end
...@@ -4,7 +4,19 @@ module Gitlab ...@@ -4,7 +4,19 @@ module Gitlab
module InstrumentationHelper module InstrumentationHelper
extend self extend self
KEYS = %i(gitaly_calls gitaly_duration_s rugged_calls rugged_duration_s redis_calls redis_duration_s redis_read_bytes redis_write_bytes).freeze KEYS = %i(
gitaly_calls
gitaly_duration_s
rugged_calls
rugged_duration_s
redis_calls
redis_duration_s
redis_read_bytes
redis_write_bytes
elasticsearch_calls
elasticsearch_duration_s
).freeze
DURATION_PRECISION = 6 # microseconds DURATION_PRECISION = 6 # microseconds
def add_instrumentation_data(payload) def add_instrumentation_data(payload)
...@@ -35,6 +47,15 @@ module Gitlab ...@@ -35,6 +47,15 @@ module Gitlab
payload[:redis_read_bytes] = redis_read_bytes payload[:redis_read_bytes] = redis_read_bytes
payload[:redis_write_bytes] = redis_write_bytes payload[:redis_write_bytes] = redis_write_bytes
end end
# Elasticsearch integration is only available in EE but instrumentation
# only depends on the Gem which is also available in FOSS.
elasticsearch_calls = Gitlab::Instrumentation::ElasticsearchTransport.get_request_count
if elasticsearch_calls > 0
payload[:elasticsearch_calls] = elasticsearch_calls
payload[:elasticsearch_duration_s] = Gitlab::Instrumentation::ElasticsearchTransport.query_time
end
end end
# Returns the queuing duration for a Sidekiq job in seconds, as a float, if the # Returns the queuing duration for a Sidekiq job in seconds, as a float, if the
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment