I have several problems maintaining large production servers where some developers drop in files from Windows environments, sometimes with BOM bytes (we use UTF-8 and have no need for them), which causes lots of trouble.
Other times, I get "no end of line" and "[DOS]" labels when editing files with vim directly on the server.
I recently discovered how to find the BOM bytes and how to delete them in a batch script. What about illegal bytes and bad EOLs? Is it safe to use DOS text files in a Linux environment? Are there any drawbacks if I convert them with the dos2unix command?
Regards
Yeah, BOM bytes are bad. The locale should determine the encoding of a file.
The other thing, as you've rightly pointed out, is line endings. DOS tends to use CRLF, while Linux uses LF only.
dos2unix will take care of this problem for you.
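To see which files actually need converting before running anything in bulk, something like the following might help. It is only a sketch: the function names list_crlf_files and to_unix_eol are made up for illustration, and it assumes GNU grep and GNU sed (the sed branch is just a fallback for machines where dos2unix is not installed).

```shell
#!/bin/sh
# List text files under a directory that contain at least one CR byte.
# -r recurses, -I skips binary files, -l prints file names only.
list_crlf_files() {
    cr=$(printf '\r')
    grep -rIl -- "$cr" "${1:-.}"
}

# Convert one file to LF line endings in place.
# Prefers dos2unix; falls back to GNU sed when dos2unix is missing.
to_unix_eol() {
    if command -v dos2unix >/dev/null 2>&1; then
        dos2unix -q "$1"
    else
        # Remove a CR only when it sits directly before the line's LF.
        sed -i -- 's/\r$//' "$1"
    fi
}
```

For example, list_crlf_files /var/www prints the candidates, and to_unix_eol can then be run on each one after a quick review. Checking first keeps binary or deliberately CRLF files out of the conversion.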
"Bad EOL" (the "no end of line" message) isn't bad. It just notifies you that there is no EOL after the last line. The Unix convention is to use EOL as a line terminator, and most Windows tools consider it a separator. Other than the message (and a slight annoyance when cat-ing such a file), there is nothing bad in it.
DOS/Windows line endings (CR/LF) can cause some problems, especially in scripts: when Linux is reading the #! line, it will use everything up to the first LF, and will consider the CR part of the interpreter filename. For executable scripts it is best to use Unix line endings (:set ff=unix in vim), otherwise Linux would attempt to execute /usr/bin/perl<CR> when you had #!/usr/bin/perl along with Windows line endings. For other files, it doesn't matter much.
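Both conditions described above can be checked from the shell without opening each file in vim. This is a rough sketch rather than a polished tool; shebang_has_cr and missing_final_newline are hypothetical helper names, not existing commands.

```shell
#!/bin/sh
# Succeeds if the first line is a #! line ending in CR, i.e. the kernel
# would look for an interpreter named something like "/usr/bin/perl<CR>".
shebang_has_cr() {
    cr=$(printf '\r')
    # Command substitution strips the trailing LF but keeps the CR.
    first=$(head -n 1 -- "$1")
    case $first in
        '#!'*"$cr") return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if a non-empty file does not end with LF (vim's "no end of line").
missing_final_newline() {
    # If the last byte is LF, the substitution strips it and the result is empty.
    [ -s "$1" ] && [ -n "$(tail -c 1 -- "$1")" ]
}
```

A script flagged by shebang_has_cr is the kind that fails with a confusing "bad interpreter" error when executed, so those are the files worth converting first.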
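The UTF-8 signature (EF BB BF) can cause even more problems. Disable it in vim with :set nobomb, or mass-remove it with sed -i '1s/^\xef\xbb\xbf//' (GNU sed; the 1 address restricts the substitution to the first line, which is where the signature sits).
EOL: End-of-line character or characters; either LF or CR/LF, whichever is appropriate.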
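For the mass removal of the UTF-8 signature, it may be safer to detect it first and only rewrite the files that really carry it. A minimal sketch, assuming GNU sed (for -i and \xHH escapes) and a head that supports -c; the names has_bom, strip_bom and list_bom_files are invented for this example.

```shell
#!/bin/sh
# Succeeds if the file starts with the UTF-8 signature EF BB BF.
has_bom() {
    [ "$(head -c 3 -- "$1" | od -An -tx1 | tr -d ' \n')" = efbbbf ]
}

# Strip the signature from the first line only.
# LC_ALL=C makes sed treat the pattern as raw bytes.
strip_bom() {
    LC_ALL=C sed -i -- '1s/^\xef\xbb\xbf//' "$1"
}

# Print every regular file under a directory that starts with a BOM.
# Assumes file names do not contain newlines.
list_bom_files() {
    find "${1:-.}" -type f | while IFS= read -r f; do
        if has_bom "$f"; then printf '%s\n' "$f"; fi
    done
}
```

Running list_bom_files on the deployment directory gives a list to review, and strip_bom can then be applied file by file.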