- 08 Nov, 2017 12 commits
-
-
Yorick Peterse authored
-
Yorick Peterse authored
Prior to this MR there were two GitHub related importers:

* Github::Import: the main importer used for GitHub projects
* Gitlab::GithubImport: importer that's somewhat confusingly used for importing Gitea projects (apparently they have a compatible API)

This MR renames the Gitea importer to Gitlab::LegacyGithubImport and introduces a new GitHub importer in the Gitlab::GithubImport namespace. This new GitHub importer uses Sidekiq for importing multiple resources in parallel, though it also has the ability to import data sequentially should this be necessary.

The new code is spread across the following directories:

* lib/gitlab/github_import: this directory contains most of the importer code such as the classes used for importing resources.
* app/workers/gitlab/github_import: this directory contains the Sidekiq workers, most of which simply use the code from the directory above.
* app/workers/concerns/gitlab/github_import: this directory provides a few modules that are included in every GitHub importer worker.

== Stages

The import work is divided into separate stages, with each stage importing a specific set of data. Stages will schedule the work that needs to be performed, followed by scheduling a job for the "AdvanceStageWorker" worker. This worker will periodically check if all work is completed and schedule the next stage if this is the case. If work is not yet completed this worker will reschedule itself.

Using this approach we don't have to block threads by calling `sleep()`, as doing so for large projects could block the thread from doing any work for many hours.

== Retrying Work

Workers will reschedule themselves whenever necessary. For example, hitting the GitHub API's rate limit will result in jobs rescheduling themselves. These jobs are not processed until the rate limit has been reset.

== User Lookups

Part of the importing process involves looking up user details in the GitHub API so we can map them to GitLab users. The old importer used an in-memory cache, but this obviously doesn't work when the work is spread across different threads. The new importer uses a Redis cache and makes sure we only perform API/database calls if absolutely necessary. Frequently used keys are refreshed, and lookup misses are also cached; removing the need for performing API/database calls if we know we don't have the data we're looking for.

== Performance & Models

The new importer in various places uses raw INSERT statements (as generated by `Gitlab::Database.bulk_insert`) instead of using Rails models. This allows us to bypass any validations and callbacks, drastically reducing the number of SQL queries and Gitaly RPC calls necessary to import projects. To ensure the code produces valid data the corresponding tests check if the produced rows are valid according to the model validation rules.
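
The stage-advancing behaviour described under "Stages" can be illustrated with a minimal sketch. This is not the importer's actual code: `FakeQueue`, `AdvanceStageSketch`, the stage names and the 30 second interval are all hypothetical, and an in-memory array stands in for Sidekiq so the example runs on its own.

```ruby
# Hypothetical in-memory stand-in for a job queue; the real importer uses Sidekiq.
class FakeQueue
  attr_reader :scheduled

  def initialize
    @scheduled = []
  end

  # Records a job to run later instead of blocking the current thread.
  def perform_in(delay, worker, *args)
    @scheduled << [delay, worker, args]
  end
end

# Sketch of a stage-advancing worker: it checks whether the current stage
# still has unfinished jobs, then either reschedules itself or schedules the
# next stage. It never calls sleep.
class AdvanceStageSketch
  INTERVAL = 30 # seconds between checks (assumed value)
  STAGES = %i[issues pull_requests notes finish].freeze # illustrative names only

  def initialize(queue, pending)
    @queue = queue
    @pending = pending # Hash of stage => number of unfinished jobs
  end

  def perform(stage)
    if @pending.fetch(stage, 0) > 0
      # Work remains: check again later rather than waiting in this thread.
      @queue.perform_in(INTERVAL, :advance_stage, stage)
      :rescheduled
    else
      index = STAGES.index(stage)
      next_stage = index && STAGES[index + 1]
      @queue.perform_in(0, :start_stage, next_stage) if next_stage
      :advanced
    end
  end
end
```

Because each check is a short job that returns immediately, a worker thread is only occupied for the duration of the check itself, however long the stage takes.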
-
Yorick Peterse authored
The GitHub importer (and probably other parts of our code) ends up calling Feature.persisted? many times (via Gitaly). By storing this data in RequestStore we can save ourselves _a lot_ of database queries. Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/39361
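
A rough sketch of the idea, assuming a per-request Hash similar to the one the request_store gem exposes: `FakeRequestStore` and `FeatureLookupSketch` are hypothetical names, and the "database" is a plain array with a counter so the saving is visible.

```ruby
# Hypothetical stand-in for a per-request store; assumed to be cleared
# between requests by surrounding middleware.
module FakeRequestStore
  def self.store
    @store ||= {}
  end

  def self.clear!
    store.clear
  end
end

# Sketch of a feature lookup that loads the persisted flag names at most
# once per request, no matter how often persisted? is called.
class FeatureLookupSketch
  attr_reader :query_count

  def initialize(persisted_names)
    @persisted_names = persisted_names
    @query_count = 0
  end

  def persisted?(name)
    names = FakeRequestStore.store[:persisted_feature_names] ||= load_names
    names.include?(name)
  end

  private

  # Simulates the database query that the cache avoids repeating.
  def load_names
    @query_count += 1
    @persisted_names.dup
  end
end
```

Repeated calls within one request reuse the stored names; once the store is cleared, the next call queries again.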
-
Yorick Peterse authored
This adds the keyword argument "return_ids" to Gitlab::Database.bulk_insert. When set to `true` (and PostgreSQL is used) this method will return an Array of the IDs of the inserted rows, otherwise it will return an empty Array.
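
To illustrate how such an option might work on PostgreSQL, the sketch below builds the text of a multi-row INSERT and appends `RETURNING id` when asked. The helper names `bulk_insert_sql` and `quote` are hypothetical and this is not the Gitlab::Database.bulk_insert implementation; it only generates SQL and does not talk to a database.

```ruby
# Hypothetical helper: builds the SQL text for a multi-row INSERT.
# Values are quoted naively for illustration only; real code should rely on
# the database adapter's quoting.
def bulk_insert_sql(table, rows, return_ids: false)
  return '' if rows.empty?

  # All rows are assumed to share the keys of the first row.
  columns = rows.first.keys
  tuples = rows.map do |row|
    values = columns.map { |column| quote(row.fetch(column)) }
    "(#{values.join(', ')})"
  end

  sql = "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{tuples.join(', ')}"
  # PostgreSQL can hand back the generated primary keys in the same statement.
  sql += ' RETURNING id' if return_ids
  sql
end

def quote(value)
  case value
  when nil then 'NULL'
  when Numeric then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end
```

A caller would execute the generated statement and collect the `id` column of the result into an Array, falling back to an empty Array when the clause is not used.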
-
Yorick Peterse authored
By using SQL::Union we can return a proper ActiveRecord::Relation, making it possible to select the columns we're interested in (instead of all of them).
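
The benefit can be sketched without ActiveRecord: wrapping the UNION in a subquery lets the outer query pick its own columns. `UnionSketch` and `select_from_union` are hypothetical names used for illustration and do not reproduce the SQL::Union class itself.

```ruby
# Hypothetical sketch: joins SQL fragments with UNION.
class UnionSketch
  def initialize(fragments)
    @fragments = fragments
  end

  def to_sql
    @fragments.map { |sql| "(#{sql})" }.join(' UNION ')
  end
end

# Wraps the union in a subquery aliased to the table name, so the outer
# SELECT can ask for only the columns it needs.
def select_from_union(table, fragments, columns: ['*'])
  union = UnionSketch.new(fragments).to_sql
  "SELECT #{columns.join(', ')} FROM (#{union}) #{table}"
end
```

For example, passing `columns: ['id']` produces a query that returns only the `id` column from the combined rows.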
-
Douwe Maan authored
(EE-port) Free up reserved words that were used under the `groups` route See merge request gitlab-org/gitlab-ee!3299
-
Nick Thomas authored
Properly report errors when GeoNode fails to create Closes #3948 See merge request gitlab-org/gitlab-ee!3300
-
Bob Van Landuyt authored
-
Bob Van Landuyt authored
Free up `labels` as a group name
Free up `avatar`, `group_members` and `milestones` as paths
Free up some group reserved words
Update failure message when finding new routes in `PathRegex`
Check redirecting with a querystring
Remove EE-specific group paths
Redirect the EE specific routes
-
Rémy Coutable authored
CE Upstream - Monday Closes gitlab-ce#39776, gitlab-ce#39771 and #3544 See merge request gitlab-org/gitlab-ee!3277
-
Stan Hu authored
Make BackgroundTransaction#labels public See merge request gitlab-org/gitlab-ce!15257
-
Stan Hu authored
Closes #3948
-
- 07 Nov, 2017 27 commits
-
-
Nick Thomas authored
Add API support and storage for GeoNode status in the database Closes #3867 and #3740 See merge request gitlab-org/gitlab-ee!3230
-
Stan Hu authored
-
Douwe Maan authored
Add post-migration to drain all Geo related redis queues Closes #3373 See merge request gitlab-org/gitlab-ee!3289
-
Dmitriy Zaporozhets authored
Signed-off-by: Dmitriy Zaporozhets <dmitriy.zaporozhets@gmail.com>
-
Toon Claes authored
Since gitlab-org/gitlab-ee!2644 is merged, some redis queues are no longer used. This post-migration drains these queues. Closes gitlab-org/gitlab-ee#3373.
-
Douwe Maan authored
Port of 27375-dashboard-activity-performance to EE See merge request gitlab-org/gitlab-ee!3287
-
Francisco Javier López authored
-
Filipa Lacerda authored
EE port of multi-file-editor-separate-commits-call See merge request gitlab-org/gitlab-ee!3285
-
Douwe Maan authored
Add maximum retry count migration and related logic Closes #2831 and gitlab-ce#33872 See merge request gitlab-org/gitlab-ee!3117
-
Nick Thomas authored
Resolve "Geo secondary help users not waste time on impossible operations." Closes #2524 See merge request gitlab-org/gitlab-ee!3260
-
Tim Zallmann authored
-
Grzegorz Bizon authored
Add all InfluxDB metrics to Prometheus Closes gitlab-ce#33643 See merge request gitlab-org/gitlab-ee!3270
-
Achilleas Pipinellis authored
Enhance the documentation for gitlab-ctl replicate-geo-database Closes #3877 See merge request gitlab-org/gitlab-ee!3268
-
Dmitriy Zaporozhets authored
Signed-off-by: Dmitriy Zaporozhets <dmitriy.zaporozhets@gmail.com>
-
Tiago Botelho authored
Removes the hard_failed state from the project state machine. When a mirror reaches the retry count limit it now stays in the failed state but is no longer scheduled, and a new message is shown.
-
Douwe Maan authored
Automatically add EE paths to eager loaded paths based on the CE paths See merge request gitlab-org/gitlab-ee!3166
-
Rémy Coutable authored
Add changes_count to the merge requests API See merge request gitlab-org/gitlab-ee!3280
-
Winnie Hellmann authored
-
Achilleas Pipinellis authored
CI disposable environments docs See merge request gitlab-org/gitlab-ee!3186
-
Pawel Chojnacki authored
- Fix Geo tests accessing private methods to test if metrics were collected
- Re-add add_event_with_values method in influx metrics
- Fix metrics_update_service_spec by allowing access to private method
- Stop using .send to access private method in tests
-
Sean McGivern authored
Resolve "Saved configuration for issue board" Closes #2518, #3792, #2093, and #2924 See merge request gitlab-org/gitlab-ee!2912
-
Achilleas Pipinellis authored
Guide users to the right version of the Geo documentation for their installation Closes #3898 See merge request gitlab-org/gitlab-ee!3283
-
Rémy Coutable authored
Signed-off-by: Rémy Coutable <remy@rymai.me>
-
Douwe Maan authored
Add support for SAML required_groups Closes #3143 See merge request gitlab-org/gitlab-ee!3223
-
Phil Hughes authored
Replace all instances of border: none to border: 0 for scss-lint See merge request gitlab-org/gitlab-ee!3288
-
Simon Knox authored
-
Tim Zallmann authored
Add metrics for monitoring canary tracks for CPU and Memory See merge request gitlab-org/gitlab-ee!3292
-
- 06 Nov, 2017 1 commit
-
-
Jacob Schatz authored
Prefer template tag instead of extra span. Closes #3916 See merge request gitlab-org/gitlab-ee!3291
-