Commit 8e2350ae authored by Lin Jen-Shin

Raise encoding confidence threshold to 50

It is recommended that we set this to 50:
https://gitlab.com/gitlab-org/gitlab-ce/issues/35098#note_35036746

In this particular issue, the detected confidence was 42 for Shift JIS,
but the data was in fact encoded in UTF-8 with a single bad
character. In this case, we shouldn't treat it as Shift JIS;
we should treat it as UTF-8 and remove the invalid bytes.

Treating it as Shift JIS would corrupt the whole data.

Unfortunately, the diff that caused this cannot be
disclosed, so we can't use it as a test example.
parent feb8974c
@@ -11,7 +11,7 @@ module Gitlab
 # obscure encoding with low confidence.
 # There is a lot more info with this merge request:
 # https://gitlab.com/gitlab-org/gitlab_git/merge_requests/77#note_4754193
-ENCODING_CONFIDENCE_THRESHOLD = 40
+ENCODING_CONFIDENCE_THRESHOLD = 50
 def encode!(message)
 return nil unless message.respond_to? :force_encoding
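The behaviour described in the commit message can be sketched roughly as follows. This is an illustrative sketch, not the actual body of `encode!`: the helper name `to_utf8` and the `detection` hash (shaped like `{ encoding: 'Shift_JIS', confidence: 42 }`) are assumptions standing in for whatever the encoding detector returns. Only the constant name and its new value come from the diff.

```ruby
# Illustrative sketch only; `to_utf8` and `detection` are hypothetical names.
ENCODING_CONFIDENCE_THRESHOLD = 50

def to_utf8(message, detection)
  # Work on a binary copy so the caller's string is left untouched.
  bytes = message.dup.force_encoding(Encoding::BINARY)

  if detection && detection[:encoding] &&
     detection[:confidence] > ENCODING_CONFIDENCE_THRESHOLD
    # Confident guess: transcode from the detected encoding to UTF-8.
    bytes.force_encoding(detection[:encoding])
         .encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: '')
  else
    # Low confidence: assume UTF-8 and drop only the invalid bytes.
    bytes.force_encoding(Encoding::UTF_8).scrub('')
  end
end

# UTF-8 text ("café ok") with one stray invalid byte (0xFF) in the middle.
sample = "caf\xC3\xA9\xFF ok".b

# Confidence 42 is below the threshold, so only the bad byte is removed.
low = to_utf8(sample, { encoding: 'Shift_JIS', confidence: 42 })

# With a confident Shift JIS guess, every multi-byte character is reinterpreted.
high = to_utf8(sample, { encoding: 'Shift_JIS', confidence: 80 })
```

With the old threshold of 40, a confidence of 42 would have taken the first branch and reinterpreted every multi-byte UTF-8 character as Shift JIS. With 50 it falls through to the second branch, which removes only the invalid byte.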