Commit 46a0f612 authored by Eduardo Bonet

Add special diff rendering for .ipynb notebooks

Changes are behind the feature flag jupyter_clean_diffs.

When an .ipynb file is detected in a commit, GitLab creates a new diff of
that file after converting it to Markdown, using the
[ipynbdiff gem](https://gitlab.com/gitlab-org/incubation-engineering/mlops/rb-ipynbdiff).
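
To illustrate the "convert first, then diff" idea, here is a minimal sketch of flattening a notebook's JSON into plain text. The helper `notebook_to_text` and the `%% <cell type>` marker are hypothetical simplifications for this example. They are not the ipynbdiff API, and the real output of the gem may look different.

```ruby
require 'json'

# Hypothetical helper, NOT part of ipynbdiff: flattens a notebook into plain
# text with one "%%" marker per cell, dropping metadata and cell outputs.
def notebook_to_text(notebook_json)
  notebook = JSON.parse(notebook_json)

  notebook.fetch('cells', []).flat_map do |cell|
    # In the notebook format, "source" is either a string or an array of lines.
    source = Array(cell['source']).join
    ["%% #{cell['cell_type']}", *source.lines.map(&:chomp), '']
  end.join("\n")
end

# Made-up sample notebook with one markdown cell and one code cell.
old_notebook = JSON.generate(
  'cells' => [
    { 'cell_type' => 'markdown', 'metadata' => {}, 'source' => ['# Title'] },
    { 'cell_type' => 'code', 'metadata' => {}, 'execution_count' => 1,
      'outputs' => [], 'source' => ["x = 1\n", 'print(x)'] }
  ],
  'metadata' => {}, 'nbformat' => 4, 'nbformat_minor' => 5
)

puts notebook_to_text(old_notebook)
```

Diffing two such flattened texts only shows changes to cell contents, so changes to fields like `execution_count` or `metadata` no longer appear in the diff.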

There are performance and architectural concerns about whether the Rails
app is the right place for this type of diffing. Since we currently
don't have a good solution, possible alternatives for the future are
being discussed here: https://gitlab.com/gitlab-org/gitlab/-/issues/342143.

Changelog: added
MR: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/71477
Epic: https://gitlab.com/groups/gitlab-org/-/epics/6589
parent 9573d754
@@ -534,3 +534,5 @@ gem 'webauthn', '~> 2.3'
 gem 'ipaddress', '~> 0.8.3'
 gem 'parslet', '~> 1.8'
+gem 'ipynbdiff', '0.3.6'
@@ -642,6 +642,9 @@ GEM
     invisible_captcha (1.1.0)
       rails (>= 4.2)
     ipaddress (0.8.3)
+    ipynbdiff (0.3.6)
+      diffy (= 3.3.0)
+      json (= 2.5.1)
     jaeger-client (1.1.0)
       opentracing (~> 0.3)
       thrift
@@ -1501,6 +1504,7 @@ DEPENDENCIES
   icalendar
   invisible_captcha (~> 1.1.0)
   ipaddress (~> 0.8.3)
+  ipynbdiff (= 0.3.6)
   jira-ruby (~> 2.1.4)
   js_regex (~> 3.7)
   json (~> 2.5.1)
......
 # frozen_string_literal: true
+require 'ipynbdiff'
 class BlobPresenter < Gitlab::View::Presenter::Delegated
   include ApplicationHelper
@@ -20,6 +21,17 @@ class BlobPresenter < Gitlab::View::Presenter::Delegated
     )
   end
+  def highlight_transformed(plain: nil)
+    load_all_blob_data
+    Gitlab::Highlight.highlight(
+      blob.path,
+      transformed_blob_data,
+      language: language,
+      plain: plain
+    )
+  end
   def plain_data
     return if blob.binary?
@@ -107,4 +119,8 @@ class BlobPresenter < Gitlab::View::Presenter::Delegated
   def language
     blob.language_from_gitattributes
   end
+  def transformed_blob_data
+    @transformed_blob ||= ( blob.path.ends_with?('.ipynb') && IpynbDiff.transform(blob.data, options: { include_metadata: false, cell_decorator: :percent }) ) || blob.data
+  end
 end
---
name: jupyter_clean_diffs
introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/71477
rollout_issue_url: https://gitlab.com/gitlab-org/gitlab/-/issues/343433
milestone: '14.5'
type: development
group: group::incubation
default_enabled: false
@@ -20,6 +20,15 @@ rendered to HTML when viewed:
 Interactive features, including JavaScript plots, don't work when viewed in
 GitLab.
+## Cleaner diffs
+> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/epics/6589) in GitLab 14.5
+When commits include changes to Jupyter Notebook files, GitLab strips out the
+noise and displays a cleaner version of the diff.
+![Jupyter Notebook Clean Diff](img/jupyter_notebook_diff.png)
 ## Jupyter Git integration
 Jupyter can be configured as an OAuth application with repository access, acting
......
@@ -43,6 +43,8 @@ module Gitlab
         # Ensure items are collected in the the batch
         new_blob_lazy
         old_blob_lazy
+        preprocess_before_diff(diff) if Feature.enabled?(:jupyter_clean_diffs, @project)
       end
       def position(position_marker, position_type: :text)
@@ -448,6 +450,19 @@ module Gitlab
         find_renderable_viewer_class(classes)
       end
+      def preprocess_before_diff(diff)
+        return unless diff.new_path.ends_with? '.ipynb'
+        from = old_blob_lazy&.data
+        to = new_blob_lazy&.data
+        new_diff = IpynbDiff.diff(from, to,
+                                  diff_opts: { context: 5, include_diff_info: true },
+                                  transform_options: { cell_decorator: :percent } )
+        diff.diff = new_diff.scan(/.*\n/)[2..-1].join('') if new_diff
+      end
       def alternate_viewer_class
         return unless viewer.instance_of?(DiffViewer::Renamed)
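
The `scan(/.*\n/)[2..-1].join('')` expression in `preprocess_before_diff` drops the first two lines of the generated diff text. A minimal sketch of that step follows. It assumes those two lines are the `---` and `+++` file headers of a unified diff, which the commit does not state explicitly, and the sample diff text is made up for illustration.

```ruby
# Made-up unified diff text, used only to demonstrate the line handling.
new_diff = <<~DIFF
  --- a/notebook.ipynb
  +++ b/notebook.ipynb
  @@ -1,2 +1,2 @@
   %% code
  -x = 1
  +x = 2
DIFF

# scan(/.*\n/) splits the text into newline-terminated lines;
# [2..-1] skips the first two lines (assumed to be the file headers).
body = new_diff.scan(/.*\n/)[2..-1].join

puts body
```

Two edge cases of this approach: a final line without a trailing newline is not matched by `/.*\n/` and is therefore dropped, and for text with fewer than two lines `[2..-1]` returns `nil`, so the following `join` call would raise a `NoMethodError`.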
......
@@ -152,6 +152,9 @@ module Gitlab
         return [] unless blob
         blob.load_all_data!
+        return blob.present.highlight_transformed.lines if Feature.enabled?(:jupyter_clean_diffs, @project)
         blob.present.highlight.lines
       end
......
@@ -53,7 +53,7 @@ module TestEnv
     'wip' => 'b9238ee',
     'csv' => '3dd0896',
     'v1.1.0' => 'b83d6e3',
-    'add-ipython-files' => 'f6b7a70',
+    'add-ipython-files' => 'c10c411',
     'add-pdf-file' => 'e774ebd',
     'squash-large-files' => '54cec52',
     'add-pdf-text-binary' => '79faa7b',
......