There are two cases that confused me:
- If a newly created file contains only Latin characters, then `file -i` will show `us-ascii`.
- If a newly created file contains both Latin and Cyrillic characters, then `file -i` will show `utf-8`.
I tested this behavior with several tools for creating files within a local copy of a git repository: IntelliJ IDEA, nano, echo, etc.
However, when I push these files to the remote repository, collaborators on Windows see them detected as UTF-8.
Since no BOM is generated during file creation, there is no way to distinguish ASCII from UTF-8 when the file contains only Latin characters: the bytes are identical in both encodings. In terms of correctly predicting the file encoding, it is better to answer ASCII than UTF-8 in that case, since UTF-8 covers many more character codes and ASCII is the narrower, more specific answer. Therefore `file -i` is doing the best it can.

Thanks to fedonkadifeli for the help.
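
To illustrate the reasoning, here is a minimal Python sketch. It is not how `file` is actually implemented; `guess_charset` is a hypothetical helper that only mimics the idea of reporting the narrowest encoding that fits the bytes, assuming the input has no BOM.

```python
def guess_charset(data: bytes) -> str:
    """Hypothetical helper: a simplified guess in the spirit of `file -i`."""
    if all(b < 0x80 for b in data):
        # Every byte is 7-bit, so the content is valid ASCII *and* valid UTF-8.
        # Reporting the narrower encoding is the more specific answer.
        return "us-ascii"
    try:
        # Bytes >= 0x80 are present: check whether they form valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Placeholder label for anything else (an assumption of this sketch).
        return "unknown"
    return "utf-8"


latin = "hello\n".encode("utf-8")
mixed = "hello привет\n".encode("utf-8")

print(guess_charset(latin))                # us-ascii
print(guess_charset(mixed))                # utf-8
# Latin-only text produces the same bytes in ASCII and UTF-8:
print(latin == "hello\n".encode("ascii"))  # True
```

The last line shows why a tool that assumes UTF-8 by default and a tool that reports `us-ascii` can both be right about the same Latin-only file.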