| ENCODING_CONFIDENCE_THRESHOLD |
= |
40 |
|
This threshold is tuned to prevent the use of encodings that CharlockHolmes
detects with low confidence. If CharlockHolmes' confidence is low, we're
better off sticking with UTF-8. Reason: git diff can return strings with
invalid UTF-8 byte sequences if it truncates a diff in the middle of a
multibyte character. In that case CharlockHolmes will try to guess the
encoding and will likely suggest an obscure encoding with low confidence.
More detail is available in this merge request: gitlab.com/gitlab-org/gitlab_git/merge_requests/77#note_4754193
|
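
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: `pick_encoding` is a hypothetical helper, the `detection` hash only mirrors the shape CharlockHolmes returns from detection (`:encoding` and `:confidence` keys), and treating a confidence exactly equal to the threshold as "low" is an assumption. The example uses only the Ruby standard library so that it runs without the gem installed.

```ruby
ENCODING_CONFIDENCE_THRESHOLD = 40

# Hypothetical helper: `detection` is assumed to look like the hash that
# CharlockHolmes detection returns, e.g.
#   { type: :text, encoding: "ISO-8859-1", confidence: 32 }
# Assumption: only a confidence strictly above the threshold is trusted.
def pick_encoding(detection)
  if detection && detection[:encoding] &&
     detection[:confidence].to_i > ENCODING_CONFIDENCE_THRESHOLD
    detection[:encoding]
  else
    "UTF-8"
  end
end

# A diff cut in the middle of "é" (bytes C3 A9) leaves a dangling lead byte,
# so the string is no longer valid UTF-8.
truncated = "caf\xC3".b.force_encoding("UTF-8")
puts truncated.valid_encoding?

# A low-confidence guess is ignored and UTF-8 is kept.
low_confidence = { type: :text, encoding: "ISO-8859-1", confidence: 32 }
puts pick_encoding(low_confidence)

# Staying with UTF-8, the dangling byte can simply be dropped.
clean = truncated.scrub("")
puts clean
```

With a guess like `low_confidence`, the helper keeps UTF-8 rather than reinterpreting the whole diff in an unrelated encoding because of one truncated character.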