Commit a348f748 authored by Alex Kalderimis's avatar Alex Kalderimis

Merge branch 'mwaw/add_time_frame_attribute_to_aggregated_metrics' into 'master'

Add time frame attribute to aggregated metric

See merge request gitlab-org/gitlab!54570
parents 8e7780e2 5f51a17c
......@@ -934,6 +934,10 @@ To add data for aggregated metrics into Usage Ping payload you should add corres
- `operator`: Operator that defines how the aggregated metric data is counted. Available operators are:
- `OR`: Removes duplicates and counts all entries that triggered any of the listed events.
- `AND`: Removes duplicates and counts all elements that were observed triggering all of the following events.
- `time_frame`: One or more valid time frames. Use these to limit the data included in the aggregated metric to events within a specific date range. Valid time frames are:
- `7d`: Last seven days of data.
- `28d`: Last twenty-eight days of data.
- `all`: All historical data; only available for `database` sourced aggregated metrics.
- `source`: Data source used to collect all events data included in aggregated metric. Valid data sources are:
- [`database`](#database-sourced-aggregated-metrics)
- [`redis`](#redis-sourced-aggregated-metrics)
......@@ -949,18 +953,24 @@ To add data for aggregated metrics into Usage Ping payload you should add corres
Example aggregated metric entries:
```yaml
- name: product_analytics_test_metrics_union_redis_sourced
- name: example_metrics_union
operator: OR
events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
source: redis
- name: product_analytics_test_metrics_intersection_with_feautre_flag_database_sourced
time_frame:
- 7d
- 28d
- name: example_metrics_intersection
operator: AND
source: database
time_frame:
- 28d
- all
events: ['dependency_scanning_pipeline_all_time', 'container_scanning_pipeline_all_time']
feature_flag: example_aggregated_metric
```
Aggregated metrics are added under `aggregated_metrics` key in both `counts_weekly` and `counts_monthly` top level keys in Usage Ping payload.
Aggregated metrics collected in the `7d` and `28d` time frames are added to the Usage Ping payload under the `aggregated_metrics` sub-key in the `counts_weekly` and `counts_monthly` top-level keys.
```ruby
{
......@@ -973,14 +983,35 @@ Aggregated metrics are added under `aggregated_metrics` key in both `counts_week
:project_snippets => 407,
:promoted_issues => 719,
:aggregated_metrics => {
:product_analytics_test_metrics_union => 7,
:product_analytics_test_metrics_intersection_with_feautre_flag => 2
:example_metrics_union => 7,
:example_metrics_intersection => 2
},
:snippets => 2513
}
}
```
Aggregated metrics for the `all` time frame are present in the `counts` top-level key, with the `aggregate_` prefix added to their names.
For example:
`example_metrics_intersection`
Becomes:
`counts.aggregate_example_metrics_intersection`
```ruby
{
:counts => {
:deployments => 11003,
:successful_deployments => 178,
:failed_deployments => 1275,
:aggregate_example_metrics_intersection => 12
}
}
```
### Redis sourced aggregated metrics
> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/45979) in GitLab 13.6.
......@@ -992,6 +1023,7 @@ you must fulfill the following requirements:
[`known_events/*.yml`](#known-events-are-added-automatically-in-usage-data-payload) files.
1. All events listed in the `events` attribute must have the same `redis_slot` attribute.
1. All events listed in the `events` attribute must have the same `aggregation` attribute.
1. The `time_frame` attribute does not include the `all` value, which is unavailable for Redis sourced aggregated metrics (see the example below).
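For illustration only, a minimal Redis sourced definition that satisfies these requirements might look like the following sketch. The event names are placeholders, not real events, and in practice they must already be declared in the `known_events/*.yml` files with identical `redis_slot` and `aggregation` values:

```yaml
# Hypothetical definition; event names are placeholders, not real events.
- name: example_redis_sourced_metrics_union
  operator: OR
  source: redis
  time_frame: [7d, 28d] # 'all' is not allowed for Redis sourced aggregates
  events: ['i_example_event_a', 'i_example_event_b']
```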
### Database sourced aggregated metrics
......@@ -1051,17 +1083,24 @@ end
#### Add new aggregated metric definition
After all metrics are persisted, you can add an aggregated metric definition at
[`aggregated_metrics/`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/aggregated_metrics/). When adding definitions for metrics names listed in the
`events:` attribute, use the same names you passed in the `metric_name` argument
while persisting metrics in previous step.
[`aggregated_metrics/`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/aggregated_metrics/).
To declare the aggregate of metrics collected with [Estimated Batch Counters](#estimated-batch-counters),
you must fulfill the following requirements:
- Metric names listed in the `events:` attribute must use the same names you passed in the `metric_name` argument while persisting metrics in the previous step.
- Every metric listed in the `events:` attribute must be persisted for **every** selected `time_frame:` value (see the persistence sketch after the example definition below).
Example definition:
```yaml
- name: product_analytics_test_metrics_intersection_database_sourced
- name: example_metrics_intersection_database_sourced
operator: AND
source: database
events: ['dependency_scanning_pipeline', 'container_scanning_pipeline']
time_frame:
- 28d
- all
```
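To satisfy the persistence requirements above, each event named in the `events:` attribute has to be written for every selected `time_frame:` value before the aggregate is defined. The following is a rough sketch only, assuming the block form of `estimate_batch_distinct_count` and the `PostgresHll.save_aggregated_metrics` helper; the relation, the column, and the exact keyword arguments are assumptions for illustration:

```ruby
# Sketch: persist the same metric name for the 28-day and all-time periods.
# The relation, column, and keyword arguments are illustrative assumptions.
[last_28_days_time_period, {}].each do |time_period| # an empty hash stands for the all-time period
  estimate_batch_distinct_count(::Ci::Build.where(time_period), :commit_id) do |result|
    ::Gitlab::Usage::Metrics::Aggregates::Sources::PostgresHll.save_aggregated_metrics(
      metric_name: 'dependency_scanning_pipeline', # same name as used in the events: list
      recorded_at_timestamp: recorded_at,
      time_period: time_period,
      data: result
    )
  end
end
```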
## Example Usage Ping payload
......
......@@ -11,6 +11,7 @@ module Gitlab
AggregatedMetricError = Class.new(StandardError)
UnknownAggregationOperator = Class.new(AggregatedMetricError)
UnknownAggregationSource = Class.new(AggregatedMetricError)
DisallowedAggregationTimeFrame = Class.new(AggregatedMetricError)
DATABASE_SOURCE = 'database'
REDIS_SOURCE = 'redis'
......@@ -30,25 +31,38 @@ module Gitlab
@recorded_at = recorded_at
end
def all_time_data
aggregated_metrics_data(start_date: nil, end_date: nil, time_frame: Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME)
end
def monthly_data
aggregated_metrics_data(**monthly_time_range)
aggregated_metrics_data(**monthly_time_range.merge(time_frame: Gitlab::Utils::UsageData::TWENTY_EIGHT_DAYS_TIME_FRAME_NAME))
end
def weekly_data
aggregated_metrics_data(**weekly_time_range)
aggregated_metrics_data(**weekly_time_range.merge(time_frame: Gitlab::Utils::UsageData::SEVEN_DAYS_TIME_FRAME_NAME))
end
private
attr_accessor :aggregated_metrics, :recorded_at
def aggregated_metrics_data(start_date:, end_date:)
def aggregated_metrics_data(start_date:, end_date:, time_frame:)
aggregated_metrics.each_with_object({}) do |aggregation, data|
next if aggregation[:feature_flag] && Feature.disabled?(aggregation[:feature_flag], default_enabled: :yaml, type: :development)
next unless aggregation[:time_frame].include?(time_frame)
case aggregation[:source]
when REDIS_SOURCE
data[aggregation[:name]] = calculate_count_for_aggregation(aggregation: aggregation, start_date: start_date, end_date: end_date)
if time_frame == Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME
data[aggregation[:name]] = Gitlab::Utils::UsageData::FALLBACK
Gitlab::ErrorTracking
.track_and_raise_for_dev_exception(
DisallowedAggregationTimeFrame.new("Aggregation time frame: 'all' is not allowed for aggregation with source: '#{REDIS_SOURCE}'")
)
else
data[aggregation[:name]] = calculate_count_for_aggregation(aggregation: aggregation, start_date: start_date, end_date: end_date)
end
when DATABASE_SOURCE
next unless Feature.enabled?('database_sourced_aggregated_metrics', default_enabled: false, type: :development)
......
......@@ -55,15 +55,15 @@ module Gitlab
end
def time_period_to_human_name(time_period)
return Gitlab::Utils::UsageData::ALL_TIME_PERIOD_HUMAN_NAME if time_period.blank?
return Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME if time_period.blank?
start_date = time_period.first.to_date
end_date = time_period.last.to_date
if (end_date - start_date).to_i > 7
Gitlab::Utils::UsageData::MONTHLY_PERIOD_HUMAN_NAME
Gitlab::Utils::UsageData::TWENTY_EIGHT_DAYS_TIME_FRAME_NAME
else
Gitlab::Utils::UsageData::WEEKLY_PERIOD_HUMAN_NAME
Gitlab::Utils::UsageData::SEVEN_DAYS_TIME_FRAME_NAME
end
end
end
......
......@@ -60,6 +60,7 @@ module Gitlab
.merge(compliance_unique_visits_data)
.merge(search_unique_visits_data)
.merge(redis_hll_counters)
.deep_merge(aggregated_metrics_data)
end
end
......@@ -224,8 +225,7 @@ module Gitlab
project_snippets: count(ProjectSnippet.where(last_28_days_time_period)),
projects_with_alerts_created: distinct_count(::AlertManagement::Alert.where(last_28_days_time_period), :project_id)
}.merge(
snowplow_event_counts(last_28_days_time_period(column: :collector_tstamp)),
aggregated_metrics_monthly
snowplow_event_counts(last_28_days_time_period(column: :collector_tstamp))
).tap do |data|
data[:snippets] = add(data[:personal_snippets], data[:project_snippets])
end
......@@ -250,10 +250,7 @@ module Gitlab
def system_usage_data_weekly
{
counts_weekly: {
}.merge(
aggregated_metrics_weekly
)
counts_weekly: {}
}
end
......@@ -713,15 +710,13 @@ module Gitlab
{ redis_hll_counters: ::Gitlab::UsageDataCounters::HLLRedisCounter.unique_events_data }
end
def aggregated_metrics_monthly
{
aggregated_metrics: aggregated_metrics.monthly_data
}
end
def aggregated_metrics_weekly
def aggregated_metrics_data
{
aggregated_metrics: aggregated_metrics.weekly_data
counts_weekly: { aggregated_metrics: aggregated_metrics.weekly_data },
counts_monthly: { aggregated_metrics: aggregated_metrics.monthly_data },
counts: aggregated_metrics
.all_time_data
.to_h { |key, value| ["aggregate_#{key}".to_sym, value.round] }
}
end
......
......@@ -11,6 +11,7 @@
operator: OR
feature_flag: usage_data_code_review_aggregation
source: redis
time_frame: [7d, 28d]
events: [
'i_code_review_user_single_file_diffs',
'i_code_review_user_create_mr',
......@@ -54,6 +55,7 @@
operator: OR
feature_flag: usage_data_code_review_aggregation
source: redis
time_frame: [7d, 28d]
events: [
'i_code_review_user_single_file_diffs',
'i_code_review_user_create_mr',
......@@ -96,6 +98,7 @@
operator: OR
feature_flag: usage_data_code_review_aggregation
source: redis
time_frame: [7d, 28d]
events: [
'i_code_review_user_vs_code_api_request'
]
......@@ -7,6 +7,10 @@
# source: defines which datasource will be used to locate events that should be included in aggregated metric. Valid values are:
# - database
# - redis
# time_frame: defines time frames for aggregated metrics:
# - 7d - last 7 days
# - 28d - last 28 days
# - all - all historically available data; this time frame is not available for the redis source
# feature_flag: name of development feature flag that will be checked before metrics aggregation is performed.
# Corresponding feature flag should have `default_enabled` attribute set to `false`.
# This attribute is OPTIONAL and can be omitted, when `feature_flag` is missing no feature flag will be checked.
......@@ -14,18 +18,22 @@
- name: compliance_features_track_unique_visits_union
operator: OR
source: redis
time_frame: [7d, 28d]
events: ['g_compliance_audit_events', 'g_compliance_dashboard', 'i_compliance_audit_events', 'a_compliance_audit_events_api', 'i_compliance_credential_inventory']
- name: product_analytics_test_metrics_union
operator: OR
source: redis
time_frame: [7d, 28d]
events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
- name: product_analytics_test_metrics_intersection
operator: AND
source: redis
time_frame: [7d, 28d]
events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
- name: incident_management_alerts_total_unique_counts
operator: OR
source: redis
time_frame: [7d, 28d]
events: [
'incident_management_alert_status_changed',
'incident_management_alert_assigned',
......@@ -35,6 +43,7 @@
- name: incident_management_incidents_total_unique_counts
operator: OR
source: redis
time_frame: [7d, 28d]
events: [
'incident_management_incident_created',
'incident_management_incident_reopened',
......@@ -51,10 +60,11 @@
- name: i_testing_paid_monthly_active_user_total
operator: OR
source: redis
time_frame: [7d, 28d]
events: [
'i_testing_web_performance_widget_total',
'i_testing_full_code_quality_report_total',
'i_testing_group_code_coverage_visit_total',
'i_testing_load_performance_widget_total',
'i_testing_metrics_report_widget_total'
]
'i_testing_web_performance_widget_total',
'i_testing_full_code_quality_report_total',
'i_testing_group_code_coverage_visit_total',
'i_testing_load_performance_widget_total',
'i_testing_metrics_report_widget_total'
]
......@@ -40,9 +40,9 @@ module Gitlab
FALLBACK = -1
DISTRIBUTED_HLL_FALLBACK = -2
ALL_TIME_PERIOD_HUMAN_NAME = "all_time"
WEEKLY_PERIOD_HUMAN_NAME = "weekly"
MONTHLY_PERIOD_HUMAN_NAME = "monthly"
ALL_TIME_TIME_FRAME_NAME = "all"
SEVEN_DAYS_TIME_FRAME_NAME = "7d"
TWENTY_EIGHT_DAYS_TIME_FRAME_NAME = "28d"
def count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil)
if batch
......
......@@ -9,10 +9,50 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
let(:entity4) { '8b9a2671-2abf-4bec-a682-22f6a8f7bf31' }
let(:end_date) { Date.current }
let(:sources) { Gitlab::Usage::Metrics::Aggregates::Sources }
let(:namespace) { described_class.to_s.deconstantize.constantize }
let_it_be(:recorded_at) { Time.current.to_i }
def aggregated_metric(name:, time_frame:, source: "redis", events: %w[event1 event2 event3], operator: "OR", feature_flag: nil)
{
name: name,
source: source,
events: events,
operator: operator,
time_frame: time_frame,
feature_flag: feature_flag
}.compact.with_indifferent_access
end
context 'aggregated_metrics_data' do
shared_examples 'db sourced aggregated metrics without database_sourced_aggregated_metrics feature' do
before do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics).and_return(aggregated_metrics)
end
end
context 'with disabled database_sourced_aggregated_metrics feature flag' do
before do
stub_feature_flags(database_sourced_aggregated_metrics: false)
end
let(:aggregated_metrics) do
[
aggregated_metric(name: "gmau_2", source: "database", time_frame: time_frame)
]
end
it 'skips database sourced metrics', :aggregate_failures do
results = {}
params = { start_date: start_date, end_date: end_date, recorded_at: recorded_at }
expect(sources::PostgresHll).not_to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3]))
expect(aggregated_metrics_data).to eq(results)
end
end
end
shared_examples 'aggregated_metrics_data' do
context 'no aggregated metric is defined' do
it 'returns empty hash' do
......@@ -31,37 +71,13 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
end
end
context 'with disabled database_sourced_aggregated_metrics feature flag' do
before do
stub_feature_flags(database_sourced_aggregated_metrics: false)
end
let(:aggregated_metrics) do
[
{ name: 'gmau_1', source: 'redis', events: %w[event3 event5], operator: "OR" },
{ name: 'gmau_2', source: 'database', events: %w[event1 event2 event3], operator: "OR" }
].map(&:with_indifferent_access)
end
it 'skips database sourced metrics', :aggregate_failures do
results = {
'gmau_1' => 5
}
params = { start_date: start_date, end_date: end_date, recorded_at: recorded_at }
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event3 event5])).and_return(5)
expect(sources::PostgresHll).not_to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3]))
expect(aggregated_metrics_data).to eq(results)
end
end
context 'with AND operator' do
let(:aggregated_metrics) do
params = { source: datasource, operator: "AND", time_frame: time_frame }
[
{ name: 'gmau_1', source: 'redis', events: %w[event3 event5], operator: "AND" },
{ name: 'gmau_2', source: 'database', events: %w[event1 event2 event3], operator: "AND" }
].map(&:with_indifferent_access)
aggregated_metric(**params.merge(name: "gmau_1", events: %w[event3 event5])),
aggregated_metric(**params.merge(name: "gmau_2"))
]
end
it 'returns the number of unique events recorded for every metric in aggregate', :aggregate_failures do
......@@ -73,30 +89,30 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
# gmau_1 data is as follow
# |A| => 4
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event3')).and_return(4)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event3')).and_return(4)
# |B| => 6
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event5')).and_return(6)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event5')).and_return(6)
# |A + B| => 8
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event3 event5])).and_return(8)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event3 event5])).and_return(8)
# Exclusion inclusion principle formula to calculate intersection of 2 sets
# |A & B| = (|A| + |B|) - |A + B| => (4 + 6) - 8 => 2
# gmau_2 data is as follow:
# |A| => 2
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event1')).and_return(2)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event1')).and_return(2)
# |B| => 3
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event2')).and_return(3)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event2')).and_return(3)
# |C| => 5
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event3')).and_return(5)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event3')).and_return(5)
# |A + B| => 4 therefore |A & B| = (|A| + |B|) - |A + B| => 2 + 3 - 4 => 1
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2])).and_return(4)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2])).and_return(4)
# |A + C| => 6 therefore |A & C| = (|A| + |C|) - |A + C| => 2 + 5 - 6 => 1
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event3])).and_return(6)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event3])).and_return(6)
# |B + C| => 7 therefore |B & C| = (|B| + |C|) - |B + C| => 3 + 5 - 7 => 1
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event2 event3])).and_return(7)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event2 event3])).and_return(7)
# |A + B + C| => 8
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3])).and_return(8)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3])).and_return(8)
# Exclusion inclusion principle formula to calculate intersection of 3 sets
# |A & B & C| = (|A & B| + |A & C| + |B & C|) - (|A| + |B| + |C|) + |A + B + C|
# (1 + 1 + 1) - (2 + 3 + 5) + 8 => 1
......@@ -108,20 +124,17 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
context 'with OR operator' do
let(:aggregated_metrics) do
[
{ name: 'gmau_1', source: 'redis', events: %w[event3 event5], operator: "OR" },
{ name: 'gmau_2', source: 'database', events: %w[event1 event2 event3], operator: "OR" }
].map(&:with_indifferent_access)
aggregated_metric(name: "gmau_1", source: datasource, time_frame: time_frame, operator: "OR")
]
end
it 'returns the number of unique events occurred for any metric in aggregate', :aggregate_failures do
results = {
'gmau_1' => 5,
'gmau_2' => 3
'gmau_1' => 5
}
params = { start_date: start_date, end_date: end_date, recorded_at: recorded_at }
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event3 event5])).and_return(5)
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3])).and_return(3)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3])).and_return(5)
expect(aggregated_metrics_data).to eq(results)
end
end
......@@ -130,21 +143,22 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
let(:enabled_feature_flag) { 'test_ff_enabled' }
let(:disabled_feature_flag) { 'test_ff_disabled' }
let(:aggregated_metrics) do
params = { source: datasource, time_frame: time_frame }
[
# represents stable aggregated metrics that has been fully released
{ name: 'gmau_without_ff', source: 'redis', events: %w[event3_slot event5_slot], operator: "OR" },
aggregated_metric(**params.merge(name: "gmau_without_ff")),
# represents new aggregated metric that is under performance testing on gitlab.com
{ name: 'gmau_enabled', source: 'redis', events: %w[event4], operator: "OR", feature_flag: enabled_feature_flag },
aggregated_metric(**params.merge(name: "gmau_enabled", feature_flag: enabled_feature_flag)),
# represents aggregated metric that is under development and shouldn't be yet collected even on gitlab.com
{ name: 'gmau_disabled', source: 'redis', events: %w[event4], operator: "OR", feature_flag: disabled_feature_flag }
].map(&:with_indifferent_access)
aggregated_metric(**params.merge(name: "gmau_disabled", feature_flag: disabled_feature_flag))
]
end
it 'does not calculate data for aggregates with ff turned off' do
skip_feature_flags_yaml_validation
skip_default_enabled_yaml_check
stub_feature_flags(enabled_feature_flag => true, disabled_feature_flag => false)
allow(sources::RedisHll).to receive(:calculate_metrics_union).and_return(6)
allow(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).and_return(6)
expect(aggregated_metrics_data).to eq('gmau_without_ff' => 6, 'gmau_enabled' => 6)
end
......@@ -156,31 +170,29 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
it 'raises error when unknown aggregation operator is used' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'redis', events: %w[event1_slot], operator: "SUM" }])
.and_return([aggregated_metric(name: 'gmau_1', source: datasource, operator: "SUM", time_frame: time_frame)])
end
expect { aggregated_metrics_data }.to raise_error Gitlab::Usage::Metrics::Aggregates::UnknownAggregationOperator
expect { aggregated_metrics_data }.to raise_error namespace::UnknownAggregationOperator
end
it 'raises error when unknown aggregation source is used' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'whoami', events: %w[event1_slot], operator: "AND" }])
.and_return([aggregated_metric(name: 'gmau_1', source: 'whoami', time_frame: time_frame)])
end
expect { aggregated_metrics_data }.to raise_error Gitlab::Usage::Metrics::Aggregates::UnknownAggregationSource
expect { aggregated_metrics_data }.to raise_error namespace::UnknownAggregationSource
end
it 're raises Gitlab::UsageDataCounters::HLLRedisCounter::EventError' do
error = Gitlab::UsageDataCounters::HLLRedisCounter::EventError
allow(Gitlab::UsageDataCounters::HLLRedisCounter).to receive(:calculate_events_union).and_raise(error)
it 'raises error when union is missing' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'redis', events: %w[event1_slot], operator: "OR" }])
.and_return([aggregated_metric(name: 'gmau_1', source: datasource, time_frame: time_frame)])
end
allow(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).and_raise(sources::UnionNotAvailable)
expect { aggregated_metrics_data }.to raise_error error
expect { aggregated_metrics_data }.to raise_error sources::UnionNotAvailable
end
end
......@@ -192,7 +204,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
it 'rescues unknown aggregation operator error' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'redis', events: %w[event1_slot], operator: "SUM" }])
.and_return([aggregated_metric(name: 'gmau_1', source: datasource, operator: "SUM", time_frame: time_frame)])
end
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
......@@ -201,20 +213,91 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
it 'rescues unknown aggregation source error' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'whoami', events: %w[event1_slot], operator: "AND" }])
.and_return([aggregated_metric(name: 'gmau_1', source: 'whoami', time_frame: time_frame)])
end
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
end
it 'rescues Gitlab::UsageDataCounters::HLLRedisCounter::EventError' do
error = Gitlab::UsageDataCounters::HLLRedisCounter::EventError
allow(Gitlab::UsageDataCounters::HLLRedisCounter).to receive(:calculate_events_union).and_raise(error)
it 'rescues error when union is missing' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'redis', events: %w[event1_slot], operator: "OR" }])
.and_return([aggregated_metric(name: 'gmau_1', source: datasource, time_frame: time_frame)])
end
allow(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).and_raise(sources::UnionNotAvailable)
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
end
end
end
end
shared_examples 'database_sourced_aggregated_metrics' do
let(:datasource) { namespace::DATABASE_SOURCE }
it_behaves_like 'aggregated_metrics_data'
end
shared_examples 'redis_sourced_aggregated_metrics' do
let(:datasource) { namespace::REDIS_SOURCE }
it_behaves_like 'aggregated_metrics_data' do
context 'error handling' do
let(:aggregated_metrics) { [aggregated_metric(name: 'gmau_1', source: datasource, time_frame: time_frame)] }
let(:error) { Gitlab::UsageDataCounters::HLLRedisCounter::EventError }
before do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics).and_return(aggregated_metrics)
end
allow(Gitlab::UsageDataCounters::HLLRedisCounter).to receive(:calculate_events_union).and_raise(error)
end
context 'development and test environment' do
it 're raises Gitlab::UsageDataCounters::HLLRedisCounter::EventError' do
expect { aggregated_metrics_data }.to raise_error error
end
end
context 'production' do
it 'rescues Gitlab::UsageDataCounters::HLLRedisCounter::EventError' do
stub_rails_env('production')
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
end
end
end
end
end
describe '.aggregated_metrics_all_time_data' do
subject(:aggregated_metrics_data) { described_class.new(recorded_at).all_time_data }
let(:start_date) { nil }
let(:end_date) { nil }
let(:time_frame) { ['all'] }
it_behaves_like 'database_sourced_aggregated_metrics'
it_behaves_like 'db sourced aggregated metrics without database_sourced_aggregated_metrics feature'
context 'redis sourced aggregated metrics' do
let(:aggregated_metrics) { [aggregated_metric(name: 'gmau_1', time_frame: time_frame)] }
before do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics).and_return(aggregated_metrics)
end
end
context 'development and test environment' do
it 'raises Gitlab::Usage::Metrics::Aggregates::DisallowedAggregationTimeFrame' do
expect { aggregated_metrics_data }.to raise_error namespace::DisallowedAggregationTimeFrame
end
end
context 'production env' do
it 'returns fallback value for unsupported time frame' do
stub_rails_env('production')
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
end
......@@ -232,32 +315,34 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
subject(:aggregated_metrics_data) { described_class.new(recorded_at).weekly_data }
let(:start_date) { 7.days.ago.to_date }
let(:time_frame) { ['7d'] }
it_behaves_like 'aggregated_metrics_data'
it_behaves_like 'database_sourced_aggregated_metrics'
it_behaves_like 'redis_sourced_aggregated_metrics'
it_behaves_like 'db sourced aggregated metrics without database_sourced_aggregated_metrics feature'
end
describe '.aggregated_metrics_monthly_data' do
subject(:aggregated_metrics_data) { described_class.new(recorded_at).monthly_data }
let(:start_date) { 4.weeks.ago.to_date }
let(:time_frame) { ['28d'] }
it_behaves_like 'aggregated_metrics_data'
it_behaves_like 'database_sourced_aggregated_metrics'
it_behaves_like 'redis_sourced_aggregated_metrics'
it_behaves_like 'db sourced aggregated metrics without database_sourced_aggregated_metrics feature'
context 'metrics union calls' do
let(:aggregated_metrics) do
[
{ name: 'gmau_3', source: 'redis', events: %w[event1_slot event2_slot event3_slot event5_slot], operator: "AND" }
].map(&:with_indifferent_access)
end
it 'caches intermediate operations', :aggregate_failures do
events = %w[event1 event2 event3 event5]
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics).and_return(aggregated_metrics)
allow(instance).to receive(:aggregated_metrics)
.and_return([aggregated_metric(name: 'gmau_1', events: events, operator: "AND", time_frame: time_frame)])
end
params = { start_date: start_date, end_date: end_date, recorded_at: recorded_at }
aggregated_metrics[0][:events].each do |event|
events.each do |event|
expect(sources::RedisHll).to receive(:calculate_metrics_union)
.with(params.merge(metric_names: event))
.once
......@@ -265,7 +350,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
end
2.upto(4) do |subset_size|
aggregated_metrics[0][:events].combination(subset_size).each do |events|
events.combination(subset_size).each do |events|
expect(sources::RedisHll).to receive(:calculate_metrics_union)
.with(params.merge(metric_names: events))
.once
......
......@@ -69,7 +69,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Sources::PostgresHll, :clean_
it 'persists serialized data in Redis' do
Gitlab::Redis::SharedState.with do |redis|
expect(redis).to receive(:set).with("#{metric_1}_weekly-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
expect(redis).to receive(:set).with("#{metric_1}_7d-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
end
save_aggregated_metrics
......@@ -81,7 +81,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Sources::PostgresHll, :clean_
it 'persists serialized data in Redis' do
Gitlab::Redis::SharedState.with do |redis|
expect(redis).to receive(:set).with("#{metric_1}_monthly-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
expect(redis).to receive(:set).with("#{metric_1}_28d-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
end
save_aggregated_metrics
......@@ -93,7 +93,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Sources::PostgresHll, :clean_
it 'persists serialized data in Redis' do
Gitlab::Redis::SharedState.with do |redis|
expect(redis).to receive(:set).with("#{metric_1}_all_time-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
expect(redis).to receive(:set).with("#{metric_1}_all-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
end
save_aggregated_metrics
......
......@@ -23,6 +23,22 @@ RSpec.describe 'aggregated metrics' do
end
end
RSpec::Matchers.define :have_known_time_frame do
allowed_time_frames = [
Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME,
Gitlab::Utils::UsageData::TWENTY_EIGHT_DAYS_TIME_FRAME_NAME,
Gitlab::Utils::UsageData::SEVEN_DAYS_TIME_FRAME_NAME
]
match do |aggregate|
(aggregate[:time_frame] - allowed_time_frames).empty?
end
failure_message do |aggregate|
"Aggregate with name: `#{aggregate[:name]}` uses not allowed time_frame`#{aggregate[:time_frame] - allowed_time_frames}`"
end
end
let_it_be(:known_events) do
Gitlab::UsageDataCounters::HLLRedisCounter.known_events
end
......@@ -38,10 +54,18 @@ RSpec.describe 'aggregated metrics' do
expect(aggregated_metrics).to all has_known_source
end
it 'all aggregated metrics have known time frame' do
expect(aggregated_metrics).to all have_known_time_frame
end
aggregated_metrics&.select { |agg| agg[:source] == Gitlab::Usage::Metrics::Aggregates::REDIS_SOURCE }&.each do |aggregate|
context "for #{aggregate[:name]} aggregate of #{aggregate[:events].join(' ')}" do
let_it_be(:events_records) { known_events.select { |event| aggregate[:events].include?(event[:name]) } }
it "does not include 'all' time frame for Redis sourced aggregate" do
expect(aggregate[:time_frame]).not_to include(Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME)
end
it "only refers to known events" do
expect(aggregate[:events]).to all be_known_event
end
......
......@@ -1375,25 +1375,20 @@ RSpec.describe Gitlab::UsageData, :aggregate_failures do
end
end
describe '.aggregated_metrics_weekly' do
subject(:aggregated_metrics_payload) { described_class.aggregated_metrics_weekly }
describe '.aggregated_metrics_data' do
it 'uses ::Gitlab::Usage::Metrics::Aggregates::Aggregate methods', :aggregate_failures do
expected_payload = {
counts_weekly: { aggregated_metrics: { global_search_gmau: 123 } },
counts_monthly: { aggregated_metrics: { global_search_gmau: 456 } },
counts: { aggregate_global_search_gmau: 789 }
}
it 'uses ::Gitlab::Usage::Metrics::Aggregates::Aggregate#weekly_data', :aggregate_failures do
expect_next_instance_of(::Gitlab::Usage::Metrics::Aggregates::Aggregate) do |instance|
expect(instance).to receive(:weekly_data).and_return(global_search_gmau: 123)
expect(instance).to receive(:monthly_data).and_return(global_search_gmau: 456)
expect(instance).to receive(:all_time_data).and_return(global_search_gmau: 789)
end
expect(aggregated_metrics_payload).to eq(aggregated_metrics: { global_search_gmau: 123 })
end
end
describe '.aggregated_metrics_monthly' do
subject(:aggregated_metrics_payload) { described_class.aggregated_metrics_monthly }
it 'uses ::Gitlab::Usage::Metrics::Aggregates::Aggregate#monthly_data', :aggregate_failures do
expect_next_instance_of(::Gitlab::Usage::Metrics::Aggregates::Aggregate) do |instance|
expect(instance).to receive(:monthly_data).and_return(global_search_gmau: 123)
end
expect(aggregated_metrics_payload).to eq(aggregated_metrics: { global_search_gmau: 123 })
expect(described_class.aggregated_metrics_data).to eq(expected_payload)
end
end
......