Commit 5f51a17c authored by Mikolaj Wawrzyniak

Add time frame attribute to aggregated metrics

In order to make aggregates more flexible, and to avoid calculating
data that is not needed, add a time frame attribute to the aggregated
metric definition.
parent 772b9df8
......@@ -934,6 +934,10 @@ To add data for aggregated metrics into Usage Ping payload you should add corres
- `operator`: Operator that defines how the aggregated metric data is counted. Available operators are:
- `OR`: Removes duplicates and counts all entries that triggered any of the listed events.
- `AND`: Removes duplicates and counts all elements that were observed triggering all of the following events.
- `time_frame`: One or more valid time frames. Use these to limit the data included in the aggregated metric to events within a specific date range. Valid time frames are:
- `7d`: Last seven days of data.
- `28d`: Last twenty-eight days of data.
- `all`: All historical data; only available for `database` sourced aggregated metrics.
- `source`: Data source used to collect all event data included in the aggregated metric. Valid data sources are:
- [`database`](#database-sourced-aggregated-metrics)
- [`redis`](#redis-sourced-aggregated-metrics)
......@@ -949,18 +953,24 @@ To add data for aggregated metrics into Usage Ping payload you should add corres
Example aggregated metric entries:
```yaml
- name: product_analytics_test_metrics_union_redis_sourced
- name: example_metrics_union
operator: OR
events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
source: redis
- name: product_analytics_test_metrics_intersection_with_feautre_flag_database_sourced
time_frame:
- 7d
- 28d
- name: example_metrics_intersection
operator: AND
source: database
time_frame:
- 28d
- all
events: ['dependency_scanning_pipeline_all_time', 'container_scanning_pipeline_all_time']
feature_flag: example_aggregated_metric
```
Aggregated metrics are added under `aggregated_metrics` key in both `counts_weekly` and `counts_monthly` top level keys in Usage Ping payload.
Aggregated metrics collected in the `7d` and `28d` time frames are added to the Usage Ping payload under the `aggregated_metrics` sub-key in the `counts_weekly` and `counts_monthly` top level keys.
```ruby
{
......@@ -973,14 +983,35 @@ Aggregated metrics are added under `aggregated_metrics` key in both `counts_week
:project_snippets => 407,
:promoted_issues => 719,
:aggregated_metrics => {
:product_analytics_test_metrics_union => 7,
:product_analytics_test_metrics_intersection_with_feautre_flag => 2
:example_metrics_union => 7,
:example_metrics_intersection => 2
},
:snippets => 2513
}
}
```
Aggregated metrics for the `all` time frame are present in the `counts` top level key, with the `aggregate_` prefix added to their name.
For example, `example_metrics_intersection` becomes `counts.aggregate_example_metrics_intersection`:
```ruby
{
:counts => {
:deployments => 11003,
:successful_deployments => 178,
:failed_deployments => 1275,
:aggregate_example_metrics_intersection => 12
}
}
```
### Redis sourced aggregated metrics
> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/45979) in GitLab 13.6.
......@@ -992,6 +1023,7 @@ you must fulfill the following requirements:
[`known_events/*.yml`](#known-events-are-added-automatically-in-usage-data-payload) files.
1. All events listed in the `events` attribute must have the same `redis_slot` attribute.
1. All events listed in the `events` attribute must have the same `aggregation` attribute.
1. The `time_frame` attribute does not include the `all` value, which is unavailable for Redis sourced aggregated metrics (see the example sketch below).
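For illustration, here is a minimal sketch of a Redis sourced definition that satisfies these requirements. It is hypothetical and reuses the `i_search_*` event names from the examples above; whether those events actually share the same `redis_slot` and `aggregation` values must be checked against the `known_events/*.yml` files.

```yaml
# Hypothetical example: event names are reused from the examples above and must
# share the same `redis_slot` and `aggregation` attributes in known_events/*.yml.
- name: example_redis_sourced_metrics_union
  operator: OR
  source: redis
  time_frame: [7d, 28d] # `all` is not allowed for Redis sourced aggregated metrics
  events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
```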
### Database sourced aggregated metrics
......@@ -1051,17 +1083,24 @@ end
#### Add new aggregated metric definition
After all metrics are persisted, you can add an aggregated metric definition at
[`aggregated_metrics/`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/aggregated_metrics/). When adding definitions for metrics names listed in the
`events:` attribute, use the same names you passed in the `metric_name` argument
while persisting metrics in previous step.
[`aggregated_metrics/`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/aggregated_metrics/).
To declare the aggregate of metrics collected with [Estimated Batch Counters](#estimated-batch-counters),
you must fulfill the following requirements:
- Metric names listed in the `events:` attribute have to use the same names you passed in the `metric_name` argument while persisting metrics in the previous step.
- Every metric listed in the `events:` attribute has to be persisted for **every** selected `time_frame:` value.
Example definition:
```yaml
- name: product_analytics_test_metrics_intersection_database_sourced
- name: example_metrics_intersection_database_sourced
operator: AND
source: database
events: ['dependency_scanning_pipeline', 'container_scanning_pipeline']
time_frame:
- 28d
- all
```
## Example Usage Ping payload
......
......@@ -11,6 +11,7 @@ module Gitlab
AggregatedMetricError = Class.new(StandardError)
UnknownAggregationOperator = Class.new(AggregatedMetricError)
UnknownAggregationSource = Class.new(AggregatedMetricError)
DisallowedAggregationTimeFrame = Class.new(AggregatedMetricError)
DATABASE_SOURCE = 'database'
REDIS_SOURCE = 'redis'
......@@ -30,25 +31,38 @@ module Gitlab
@recorded_at = recorded_at
end
def all_time_data
aggregated_metrics_data(start_date: nil, end_date: nil, time_frame: Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME)
end
def monthly_data
aggregated_metrics_data(**monthly_time_range)
aggregated_metrics_data(**monthly_time_range.merge(time_frame: Gitlab::Utils::UsageData::TWENTY_EIGHT_DAYS_TIME_FRAME_NAME))
end
def weekly_data
aggregated_metrics_data(**weekly_time_range)
aggregated_metrics_data(**weekly_time_range.merge(time_frame: Gitlab::Utils::UsageData::SEVEN_DAYS_TIME_FRAME_NAME))
end
private
attr_accessor :aggregated_metrics, :recorded_at
def aggregated_metrics_data(start_date:, end_date:)
def aggregated_metrics_data(start_date:, end_date:, time_frame:)
aggregated_metrics.each_with_object({}) do |aggregation, data|
next if aggregation[:feature_flag] && Feature.disabled?(aggregation[:feature_flag], default_enabled: :yaml, type: :development)
next unless aggregation[:time_frame].include?(time_frame)
case aggregation[:source]
when REDIS_SOURCE
data[aggregation[:name]] = calculate_count_for_aggregation(aggregation: aggregation, start_date: start_date, end_date: end_date)
if time_frame == Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME
data[aggregation[:name]] = Gitlab::Utils::UsageData::FALLBACK
Gitlab::ErrorTracking
.track_and_raise_for_dev_exception(
DisallowedAggregationTimeFrame.new("Aggregation time frame: 'all' is not allowed for aggregation with source: '#{REDIS_SOURCE}'")
)
else
data[aggregation[:name]] = calculate_count_for_aggregation(aggregation: aggregation, start_date: start_date, end_date: end_date)
end
when DATABASE_SOURCE
next unless Feature.enabled?('database_sourced_aggregated_metrics', default_enabled: false, type: :development)
......
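As a quick orientation (a hedged sketch only, based on the methods changed above and the specs further below), the three public entry points of this class map to the payload sections described in the documentation:

```ruby
# Sketch only: class and method names are taken from this diff; the argument is
# the `recorded_at` timestamp used by the specs below.
aggregate = ::Gitlab::Usage::Metrics::Aggregates::Aggregate.new(Time.current.to_i)

aggregate.weekly_data   # aggregates whose time_frame includes '7d'  -> counts_weekly
aggregate.monthly_data  # aggregates whose time_frame includes '28d' -> counts_monthly
aggregate.all_time_data # aggregates whose time_frame includes 'all' -> counts (database source only)
```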
......@@ -55,15 +55,15 @@ module Gitlab
end
def time_period_to_human_name(time_period)
return Gitlab::Utils::UsageData::ALL_TIME_PERIOD_HUMAN_NAME if time_period.blank?
return Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME if time_period.blank?
start_date = time_period.first.to_date
end_date = time_period.last.to_date
if (end_date - start_date).to_i > 7
Gitlab::Utils::UsageData::MONTHLY_PERIOD_HUMAN_NAME
Gitlab::Utils::UsageData::TWENTY_EIGHT_DAYS_TIME_FRAME_NAME
else
Gitlab::Utils::UsageData::WEEKLY_PERIOD_HUMAN_NAME
Gitlab::Utils::UsageData::SEVEN_DAYS_TIME_FRAME_NAME
end
end
end
......
......@@ -60,6 +60,7 @@ module Gitlab
.merge(compliance_unique_visits_data)
.merge(search_unique_visits_data)
.merge(redis_hll_counters)
.deep_merge(aggregated_metrics_data)
end
end
......@@ -224,8 +225,7 @@ module Gitlab
project_snippets: count(ProjectSnippet.where(last_28_days_time_period)),
projects_with_alerts_created: distinct_count(::AlertManagement::Alert.where(last_28_days_time_period), :project_id)
}.merge(
snowplow_event_counts(last_28_days_time_period(column: :collector_tstamp)),
aggregated_metrics_monthly
snowplow_event_counts(last_28_days_time_period(column: :collector_tstamp))
).tap do |data|
data[:snippets] = add(data[:personal_snippets], data[:project_snippets])
end
......@@ -250,10 +250,7 @@ module Gitlab
def system_usage_data_weekly
{
counts_weekly: {
}.merge(
aggregated_metrics_weekly
)
counts_weekly: {}
}
end
......@@ -713,15 +710,13 @@ module Gitlab
{ redis_hll_counters: ::Gitlab::UsageDataCounters::HLLRedisCounter.unique_events_data }
end
def aggregated_metrics_monthly
{
aggregated_metrics: aggregated_metrics.monthly_data
}
end
def aggregated_metrics_weekly
def aggregated_metrics_data
{
aggregated_metrics: aggregated_metrics.weekly_data
counts_weekly: { aggregated_metrics: aggregated_metrics.weekly_data },
counts_monthly: { aggregated_metrics: aggregated_metrics.monthly_data },
counts: aggregated_metrics
.all_time_data
.to_h { |key, value| ["aggregate_#{key}".to_sym, value.round] }
}
end
......
......@@ -11,6 +11,7 @@
operator: OR
feature_flag: usage_data_code_review_aggregation
source: redis
time_frame: [7d, 28d]
events: [
'i_code_review_user_single_file_diffs',
'i_code_review_user_create_mr',
......@@ -54,6 +55,7 @@
operator: OR
feature_flag: usage_data_code_review_aggregation
source: redis
time_frame: [7d, 28d]
events: [
'i_code_review_user_single_file_diffs',
'i_code_review_user_create_mr',
......@@ -96,6 +98,7 @@
operator: OR
feature_flag: usage_data_code_review_aggregation
source: redis
time_frame: [7d, 28d]
events: [
'i_code_review_user_vs_code_api_request'
]
......@@ -7,6 +7,10 @@
# source: defines which datasource will be used to locate events that should be included in aggregated metric. Valid values are:
# - database
# - redis
# time_frame: defines time frames for aggregated metrics:
# - 7d - last 7 days
# - 28d - last 28 days
# - all - all available historical data; this time frame is not available for the redis source
# feature_flag: name of development feature flag that will be checked before metrics aggregation is performed.
# Corresponding feature flag should have `default_enabled` attribute set to `false`.
# This attribute is OPTIONAL and can be omitted; when `feature_flag` is missing, no feature flag will be checked.
......@@ -14,18 +18,22 @@
- name: compliance_features_track_unique_visits_union
operator: OR
source: redis
time_frame: [7d, 28d]
events: ['g_compliance_audit_events', 'g_compliance_dashboard', 'i_compliance_audit_events', 'a_compliance_audit_events_api', 'i_compliance_credential_inventory']
- name: product_analytics_test_metrics_union
operator: OR
source: redis
time_frame: [7d, 28d]
events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
- name: product_analytics_test_metrics_intersection
operator: AND
source: redis
time_frame: [7d, 28d]
events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
- name: incident_management_alerts_total_unique_counts
operator: OR
source: redis
time_frame: [7d, 28d]
events: [
'incident_management_alert_status_changed',
'incident_management_alert_assigned',
......@@ -35,6 +43,7 @@
- name: incident_management_incidents_total_unique_counts
operator: OR
source: redis
time_frame: [7d, 28d]
events: [
'incident_management_incident_created',
'incident_management_incident_reopened',
......@@ -51,10 +60,11 @@
- name: i_testing_paid_monthly_active_user_total
operator: OR
source: redis
time_frame: [7d, 28d]
events: [
'i_testing_web_performance_widget_total',
'i_testing_full_code_quality_report_total',
'i_testing_group_code_coverage_visit_total',
'i_testing_load_performance_widget_total',
'i_testing_metrics_report_widget_total'
]
'i_testing_web_performance_widget_total',
'i_testing_full_code_quality_report_total',
'i_testing_group_code_coverage_visit_total',
'i_testing_load_performance_widget_total',
'i_testing_metrics_report_widget_total'
]
......@@ -40,9 +40,9 @@ module Gitlab
FALLBACK = -1
DISTRIBUTED_HLL_FALLBACK = -2
ALL_TIME_PERIOD_HUMAN_NAME = "all_time"
WEEKLY_PERIOD_HUMAN_NAME = "weekly"
MONTHLY_PERIOD_HUMAN_NAME = "monthly"
ALL_TIME_TIME_FRAME_NAME = "all"
SEVEN_DAYS_TIME_FRAME_NAME = "7d"
TWENTY_EIGHT_DAYS_TIME_FRAME_NAME = "28d"
def count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil)
if batch
......
......@@ -9,10 +9,50 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
let(:entity4) { '8b9a2671-2abf-4bec-a682-22f6a8f7bf31' }
let(:end_date) { Date.current }
let(:sources) { Gitlab::Usage::Metrics::Aggregates::Sources }
let(:namespace) { described_class.to_s.deconstantize.constantize }
let_it_be(:recorded_at) { Time.current.to_i }
def aggregated_metric(name:, time_frame:, source: "redis", events: %w[event1 event2 event3], operator: "OR", feature_flag: nil)
{
name: name,
source: source,
events: events,
operator: operator,
time_frame: time_frame,
feature_flag: feature_flag
}.compact.with_indifferent_access
end
context 'aggregated_metrics_data' do
shared_examples 'db sourced aggregated metrics without database_sourced_aggregated_metrics feature' do
before do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics).and_return(aggregated_metrics)
end
end
context 'with disabled database_sourced_aggregated_metrics feature flag' do
before do
stub_feature_flags(database_sourced_aggregated_metrics: false)
end
let(:aggregated_metrics) do
[
aggregated_metric(name: "gmau_2", source: "database", time_frame: time_frame)
]
end
it 'skips database sourced metrics', :aggregate_failures do
results = {}
params = { start_date: start_date, end_date: end_date, recorded_at: recorded_at }
expect(sources::PostgresHll).not_to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3]))
expect(aggregated_metrics_data).to eq(results)
end
end
end
shared_examples 'aggregated_metrics_data' do
context 'no aggregated metric is defined' do
it 'returns empty hash' do
......@@ -31,37 +71,13 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
end
end
context 'with disabled database_sourced_aggregated_metrics feature flag' do
before do
stub_feature_flags(database_sourced_aggregated_metrics: false)
end
let(:aggregated_metrics) do
[
{ name: 'gmau_1', source: 'redis', events: %w[event3 event5], operator: "OR" },
{ name: 'gmau_2', source: 'database', events: %w[event1 event2 event3], operator: "OR" }
].map(&:with_indifferent_access)
end
it 'skips database sourced metrics', :aggregate_failures do
results = {
'gmau_1' => 5
}
params = { start_date: start_date, end_date: end_date, recorded_at: recorded_at }
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event3 event5])).and_return(5)
expect(sources::PostgresHll).not_to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3]))
expect(aggregated_metrics_data).to eq(results)
end
end
context 'with AND operator' do
let(:aggregated_metrics) do
params = { source: datasource, operator: "AND", time_frame: time_frame }
[
{ name: 'gmau_1', source: 'redis', events: %w[event3 event5], operator: "AND" },
{ name: 'gmau_2', source: 'database', events: %w[event1 event2 event3], operator: "AND" }
].map(&:with_indifferent_access)
aggregated_metric(**params.merge(name: "gmau_1", events: %w[event3 event5])),
aggregated_metric(**params.merge(name: "gmau_2"))
]
end
it 'returns the number of unique events recorded for every metric in aggregate', :aggregate_failures do
......@@ -73,30 +89,30 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
# gmau_1 data is as follows
# |A| => 4
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event3')).and_return(4)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event3')).and_return(4)
# |B| => 6
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event5')).and_return(6)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event5')).and_return(6)
# |A + B| => 8
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event3 event5])).and_return(8)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event3 event5])).and_return(8)
# Inclusion-exclusion principle formula to calculate the intersection of 2 sets
# |A & B| = (|A| + |B|) - |A + B| => (4 + 6) - 8 => 2
# gmau_2 data is as follows:
# |A| => 2
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event1')).and_return(2)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event1')).and_return(2)
# |B| => 3
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event2')).and_return(3)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event2')).and_return(3)
# |C| => 5
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event3')).and_return(5)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: 'event3')).and_return(5)
# |A + B| => 4 therefore |A & B| = (|A| + |B|) - |A + B| => 2 + 3 - 4 => 1
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2])).and_return(4)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2])).and_return(4)
# |A + C| => 6 therefore |A & C| = (|A| + |C|) - |A + C| => 2 + 5 - 6 => 1
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event3])).and_return(6)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event3])).and_return(6)
# |B + C| => 7 therefore |B & C| = (|B| + |C|) - |B + C| => 3 + 5 - 7 => 1
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event2 event3])).and_return(7)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event2 event3])).and_return(7)
# |A + B + C| => 8
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3])).and_return(8)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3])).and_return(8)
# Inclusion-exclusion principle formula to calculate the intersection of 3 sets
# |A & B & C| = (|A & B| + |A & C| + |B & C|) - (|A| + |B| + |C|) + |A + B + C|
# (1 + 1 + 1) - (2 + 3 + 5) + 8 => 1
......@@ -108,20 +124,17 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
context 'with OR operator' do
let(:aggregated_metrics) do
[
{ name: 'gmau_1', source: 'redis', events: %w[event3 event5], operator: "OR" },
{ name: 'gmau_2', source: 'database', events: %w[event1 event2 event3], operator: "OR" }
].map(&:with_indifferent_access)
aggregated_metric(name: "gmau_1", source: datasource, time_frame: time_frame, operator: "OR")
]
end
it 'returns the number of unique events occurred for any metric in aggregate', :aggregate_failures do
results = {
'gmau_1' => 5,
'gmau_2' => 3
'gmau_1' => 5
}
params = { start_date: start_date, end_date: end_date, recorded_at: recorded_at }
expect(sources::RedisHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event3 event5])).and_return(5)
expect(sources::PostgresHll).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3])).and_return(3)
expect(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).with(params.merge(metric_names: %w[event1 event2 event3])).and_return(5)
expect(aggregated_metrics_data).to eq(results)
end
end
......@@ -130,21 +143,22 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
let(:enabled_feature_flag) { 'test_ff_enabled' }
let(:disabled_feature_flag) { 'test_ff_disabled' }
let(:aggregated_metrics) do
params = { source: datasource, time_frame: time_frame }
[
# represents a stable aggregated metric that has been fully released
{ name: 'gmau_without_ff', source: 'redis', events: %w[event3_slot event5_slot], operator: "OR" },
aggregated_metric(**params.merge(name: "gmau_without_ff")),
# represents a new aggregated metric that is under performance testing on gitlab.com
{ name: 'gmau_enabled', source: 'redis', events: %w[event4], operator: "OR", feature_flag: enabled_feature_flag },
aggregated_metric(**params.merge(name: "gmau_enabled", feature_flag: enabled_feature_flag)),
# represents an aggregated metric that is under development and shouldn't yet be collected even on gitlab.com
{ name: 'gmau_disabled', source: 'redis', events: %w[event4], operator: "OR", feature_flag: disabled_feature_flag }
].map(&:with_indifferent_access)
aggregated_metric(**params.merge(name: "gmau_disabled", feature_flag: disabled_feature_flag))
]
end
it 'does not calculate data for aggregates with ff turned off' do
skip_feature_flags_yaml_validation
skip_default_enabled_yaml_check
stub_feature_flags(enabled_feature_flag => true, disabled_feature_flag => false)
allow(sources::RedisHll).to receive(:calculate_metrics_union).and_return(6)
allow(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).and_return(6)
expect(aggregated_metrics_data).to eq('gmau_without_ff' => 6, 'gmau_enabled' => 6)
end
......@@ -156,31 +170,29 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
it 'raises error when unknown aggregation operator is used' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'redis', events: %w[event1_slot], operator: "SUM" }])
.and_return([aggregated_metric(name: 'gmau_1', source: datasource, operator: "SUM", time_frame: time_frame)])
end
expect { aggregated_metrics_data }.to raise_error Gitlab::Usage::Metrics::Aggregates::UnknownAggregationOperator
expect { aggregated_metrics_data }.to raise_error namespace::UnknownAggregationOperator
end
it 'raises error when unknown aggregation source is used' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'whoami', events: %w[event1_slot], operator: "AND" }])
.and_return([aggregated_metric(name: 'gmau_1', source: 'whoami', time_frame: time_frame)])
end
expect { aggregated_metrics_data }.to raise_error Gitlab::Usage::Metrics::Aggregates::UnknownAggregationSource
expect { aggregated_metrics_data }.to raise_error namespace::UnknownAggregationSource
end
it 're raises Gitlab::UsageDataCounters::HLLRedisCounter::EventError' do
error = Gitlab::UsageDataCounters::HLLRedisCounter::EventError
allow(Gitlab::UsageDataCounters::HLLRedisCounter).to receive(:calculate_events_union).and_raise(error)
it 'raises error when union is missing' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'redis', events: %w[event1_slot], operator: "OR" }])
.and_return([aggregated_metric(name: 'gmau_1', source: datasource, time_frame: time_frame)])
end
allow(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).and_raise(sources::UnionNotAvailable)
expect { aggregated_metrics_data }.to raise_error error
expect { aggregated_metrics_data }.to raise_error sources::UnionNotAvailable
end
end
......@@ -192,7 +204,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
it 'rescues unknown aggregation operator error' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'redis', events: %w[event1_slot], operator: "SUM" }])
.and_return([aggregated_metric(name: 'gmau_1', source: datasource, operator: "SUM", time_frame: time_frame)])
end
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
......@@ -201,20 +213,91 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
it 'rescues unknown aggregation source error' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'whoami', events: %w[event1_slot], operator: "AND" }])
.and_return([aggregated_metric(name: 'gmau_1', source: 'whoami', time_frame: time_frame)])
end
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
end
it 'rescues Gitlab::UsageDataCounters::HLLRedisCounter::EventError' do
error = Gitlab::UsageDataCounters::HLLRedisCounter::EventError
allow(Gitlab::UsageDataCounters::HLLRedisCounter).to receive(:calculate_events_union).and_raise(error)
it 'rescues error when union is missing' do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics)
.and_return([{ name: 'gmau_1', source: 'redis', events: %w[event1_slot], operator: "OR" }])
.and_return([aggregated_metric(name: 'gmau_1', source: datasource, time_frame: time_frame)])
end
allow(namespace::SOURCES[datasource]).to receive(:calculate_metrics_union).and_raise(sources::UnionNotAvailable)
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
end
end
end
end
shared_examples 'database_sourced_aggregated_metrics' do
let(:datasource) { namespace::DATABASE_SOURCE }
it_behaves_like 'aggregated_metrics_data'
end
shared_examples 'redis_sourced_aggregated_metrics' do
let(:datasource) { namespace::REDIS_SOURCE }
it_behaves_like 'aggregated_metrics_data' do
context 'error handling' do
let(:aggregated_metrics) { [aggregated_metric(name: 'gmau_1', source: datasource, time_frame: time_frame)] }
let(:error) { Gitlab::UsageDataCounters::HLLRedisCounter::EventError }
before do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics).and_return(aggregated_metrics)
end
allow(Gitlab::UsageDataCounters::HLLRedisCounter).to receive(:calculate_events_union).and_raise(error)
end
context 'development and test environment' do
it 're raises Gitlab::UsageDataCounters::HLLRedisCounter::EventError' do
expect { aggregated_metrics_data }.to raise_error error
end
end
context 'production' do
it 'rescues Gitlab::UsageDataCounters::HLLRedisCounter::EventError' do
stub_rails_env('production')
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
end
end
end
end
end
describe '.aggregated_metrics_all_time_data' do
subject(:aggregated_metrics_data) { described_class.new(recorded_at).all_time_data }
let(:start_date) { nil }
let(:end_date) { nil }
let(:time_frame) { ['all'] }
it_behaves_like 'database_sourced_aggregated_metrics'
it_behaves_like 'db sourced aggregated metrics without database_sourced_aggregated_metrics feature'
context 'redis sourced aggregated metrics' do
let(:aggregated_metrics) { [aggregated_metric(name: 'gmau_1', time_frame: time_frame)] }
before do
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics).and_return(aggregated_metrics)
end
end
context 'development and test environment' do
it 'raises Gitlab::Usage::Metrics::Aggregates::DisallowedAggregationTimeFrame' do
expect { aggregated_metrics_data }.to raise_error namespace::DisallowedAggregationTimeFrame
end
end
context 'production env' do
it 'returns fallback value for unsupported time frame' do
stub_rails_env('production')
expect(aggregated_metrics_data).to eq('gmau_1' => -1)
end
......@@ -232,32 +315,34 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
subject(:aggregated_metrics_data) { described_class.new(recorded_at).weekly_data }
let(:start_date) { 7.days.ago.to_date }
let(:time_frame) { ['7d'] }
it_behaves_like 'aggregated_metrics_data'
it_behaves_like 'database_sourced_aggregated_metrics'
it_behaves_like 'redis_sourced_aggregated_metrics'
it_behaves_like 'db sourced aggregated metrics without database_sourced_aggregated_metrics feature'
end
describe '.aggregated_metrics_monthly_data' do
subject(:aggregated_metrics_data) { described_class.new(recorded_at).monthly_data }
let(:start_date) { 4.weeks.ago.to_date }
let(:time_frame) { ['28d'] }
it_behaves_like 'aggregated_metrics_data'
it_behaves_like 'database_sourced_aggregated_metrics'
it_behaves_like 'redis_sourced_aggregated_metrics'
it_behaves_like 'db sourced aggregated metrics without database_sourced_aggregated_metrics feature'
context 'metrics union calls' do
let(:aggregated_metrics) do
[
{ name: 'gmau_3', source: 'redis', events: %w[event1_slot event2_slot event3_slot event5_slot], operator: "AND" }
].map(&:with_indifferent_access)
end
it 'caches intermediate operations', :aggregate_failures do
events = %w[event1 event2 event3 event5]
allow_next_instance_of(described_class) do |instance|
allow(instance).to receive(:aggregated_metrics).and_return(aggregated_metrics)
allow(instance).to receive(:aggregated_metrics)
.and_return([aggregated_metric(name: 'gmau_1', events: events, operator: "AND", time_frame: time_frame)])
end
params = { start_date: start_date, end_date: end_date, recorded_at: recorded_at }
aggregated_metrics[0][:events].each do |event|
events.each do |event|
expect(sources::RedisHll).to receive(:calculate_metrics_union)
.with(params.merge(metric_names: event))
.once
......@@ -265,7 +350,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Aggregate, :clean_gitlab_redi
end
2.upto(4) do |subset_size|
aggregated_metrics[0][:events].combination(subset_size).each do |events|
events.combination(subset_size).each do |events|
expect(sources::RedisHll).to receive(:calculate_metrics_union)
.with(params.merge(metric_names: events))
.once
......
......@@ -69,7 +69,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Sources::PostgresHll, :clean_
it 'persists serialized data in Redis' do
Gitlab::Redis::SharedState.with do |redis|
expect(redis).to receive(:set).with("#{metric_1}_weekly-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
expect(redis).to receive(:set).with("#{metric_1}_7d-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
end
save_aggregated_metrics
......@@ -81,7 +81,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Sources::PostgresHll, :clean_
it 'persists serialized data in Redis' do
Gitlab::Redis::SharedState.with do |redis|
expect(redis).to receive(:set).with("#{metric_1}_monthly-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
expect(redis).to receive(:set).with("#{metric_1}_28d-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
end
save_aggregated_metrics
......@@ -93,7 +93,7 @@ RSpec.describe Gitlab::Usage::Metrics::Aggregates::Sources::PostgresHll, :clean_
it 'persists serialized data in Redis' do
Gitlab::Redis::SharedState.with do |redis|
expect(redis).to receive(:set).with("#{metric_1}_all_time-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
expect(redis).to receive(:set).with("#{metric_1}_all-#{recorded_at.to_i}", '{"141":1,"56":1}', ex: 120.hours)
end
save_aggregated_metrics
......
......@@ -23,6 +23,22 @@ RSpec.describe 'aggregated metrics' do
end
end
RSpec::Matchers.define :have_known_time_frame do
allowed_time_frames = [
Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME,
Gitlab::Utils::UsageData::TWENTY_EIGHT_DAYS_TIME_FRAME_NAME,
Gitlab::Utils::UsageData::SEVEN_DAYS_TIME_FRAME_NAME
]
match do |aggregate|
(aggregate[:time_frame] - allowed_time_frames).empty?
end
failure_message do |aggregate|
"Aggregate with name: `#{aggregate[:name]}` uses not allowed time_frame`#{aggregate[:time_frame] - allowed_time_frames}`"
end
end
let_it_be(:known_events) do
Gitlab::UsageDataCounters::HLLRedisCounter.known_events
end
......@@ -38,10 +54,18 @@ RSpec.describe 'aggregated metrics' do
expect(aggregated_metrics).to all has_known_source
end
it 'all aggregated metrics has known time frame' do
expect(aggregated_metrics).to all have_known_time_frame
end
aggregated_metrics&.select { |agg| agg[:source] == Gitlab::Usage::Metrics::Aggregates::REDIS_SOURCE }&.each do |aggregate|
context "for #{aggregate[:name]} aggregate of #{aggregate[:events].join(' ')}" do
let_it_be(:events_records) { known_events.select { |event| aggregate[:events].include?(event[:name]) } }
it "does not include 'all' time frame for Redis sourced aggregate" do
expect(aggregate[:time_frame]).not_to include(Gitlab::Utils::UsageData::ALL_TIME_TIME_FRAME_NAME)
end
it "only refers to known events" do
expect(aggregate[:events]).to all be_known_event
end
......
......@@ -1375,25 +1375,20 @@ RSpec.describe Gitlab::UsageData, :aggregate_failures do
end
end
describe '.aggregated_metrics_weekly' do
subject(:aggregated_metrics_payload) { described_class.aggregated_metrics_weekly }
describe '.aggregated_metrics_data' do
it 'uses ::Gitlab::Usage::Metrics::Aggregates::Aggregate methods', :aggregate_failures do
expected_payload = {
counts_weekly: { aggregated_metrics: { global_search_gmau: 123 } },
counts_monthly: { aggregated_metrics: { global_search_gmau: 456 } },
counts: { aggregate_global_search_gmau: 789 }
}
it 'uses ::Gitlab::Usage::Metrics::Aggregates::Aggregate#weekly_data', :aggregate_failures do
expect_next_instance_of(::Gitlab::Usage::Metrics::Aggregates::Aggregate) do |instance|
expect(instance).to receive(:weekly_data).and_return(global_search_gmau: 123)
expect(instance).to receive(:monthly_data).and_return(global_search_gmau: 456)
expect(instance).to receive(:all_time_data).and_return(global_search_gmau: 789)
end
expect(aggregated_metrics_payload).to eq(aggregated_metrics: { global_search_gmau: 123 })
end
end
describe '.aggregated_metrics_monthly' do
subject(:aggregated_metrics_payload) { described_class.aggregated_metrics_monthly }
it 'uses ::Gitlab::Usage::Metrics::Aggregates::Aggregate#monthly_data', :aggregate_failures do
expect_next_instance_of(::Gitlab::Usage::Metrics::Aggregates::Aggregate) do |instance|
expect(instance).to receive(:monthly_data).and_return(global_search_gmau: 123)
end
expect(aggregated_metrics_payload).to eq(aggregated_metrics: { global_search_gmau: 123 })
expect(described_class.aggregated_metrics_data).to eq(expected_payload)
end
end
......