    Raise encoding confidence threshold to 50 · 8e2350ae
    Lin Jen-Shin authored
    It is recommended that we set this to 50:
    https://gitlab.com/gitlab-org/gitlab-ce/issues/35098#note_35036746
    
    In that issue, the detector reported Shift JIS with a confidence
    of 42, but the content was actually UTF-8 with a single bad
    character. In such a case we shouldn't treat it as Shift JIS;
    we should treat it as UTF-8 and remove the invalid bytes.
    
    Treating it as Shift JIS would corrupt all of the data.
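
A minimal sketch of the behaviour described above, not the actual contents of encoding_helper.rb. The constant name, the `to_utf8` method, and the shape of the `detection` hash (`{ encoding:, confidence: }`) are assumptions made for illustration. Only the threshold value of 50 comes from this commit message.

```ruby
# Hypothetical constant name; the value 50 is the one this commit recommends.
ENCODING_CONFIDENCE_THRESHOLD = 50

# Convert raw bytes to UTF-8.
# `detection` is assumed to be a hash such as
# { encoding: 'Shift_JIS', confidence: 42 }, i.e. what an encoding
# detector might report for `raw`. It may also be nil.
def to_utf8(raw, detection)
  text = raw.dup

  confident = detection &&
              detection[:encoding] &&
              detection[:confidence].to_i >= ENCODING_CONFIDENCE_THRESHOLD

  if confident
    # Confident guess: reinterpret the bytes in the detected encoding
    # and transcode them to UTF-8.
    text.force_encoding(detection[:encoding])
    return text.encode('UTF-8', invalid: :replace, undef: :replace, replace: '')
  end

  # Low-confidence guess: assume UTF-8 and drop only the invalid bytes,
  # leaving the rest of the data untouched.
  text.force_encoding('UTF-8').scrub('')
end
```

With a reported confidence of 42 the Shift JIS guess is ignored, so a UTF-8 string with one stray byte only loses that byte, for example `to_utf8("caf\xC3\xA9\xFFbar".b, { encoding: 'Shift_JIS', confidence: 42 })`.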
    
    Unfortunately, the diff that triggered this cannot be disclosed,
    so we can't use it as a test example.
encoding_helper.rb 2.15 KB