E.g., I have a file called Porträt.pdf.
But the filename was created with a character set that Ubuntu doesn't display properly.
What is the best practice for renaming such characters in filenames when you have several files and can't type the special character in terminal commands because of its encoding?
In theory it can be tricky to know the character encoding used for the file names, but in most cases the problem comes from Windows systems and programs that still use a legacy encoding such as Latin1 instead of UTF-8. Run
convmv -f cp850 -t utf-8 *
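without quotes in the folder with the broken files and give it a try. (You need the convmv package installed.) By default convmv only prints what it would do; add the --notest option to actually rename the files.

If you just want to get rid of some characters, you could try this: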
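For instance, here is a minimal sketch in plain POSIX shell. The function names sanitize_name and sanitize_dir are made up for illustration, and so is the convention of passing -n as the second argument for a dry run. Keeping "." and "_" in addition to letters, digits and dashes is also an assumption, made so that extensions like .pdf survive.

```shell
#!/bin/sh
# sanitize_name NAME: print NAME with every byte outside
# A-Z a-z 0-9 . _ - replaced by "_".
# LC_ALL=C makes tr work byte by byte, so bytes that are not valid
# UTF-8 (e.g. a lone Latin1 "ä") are replaced as well.
sanitize_name() {
    printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '_'
}

# sanitize_dir DIR [-n]: rename the (non-hidden) entries of DIR.
# With -n as second argument, only print what would be renamed.
sanitize_dir() {
    dir=$1
    dry_run=${2:-}
    for path in "$dir"/*; do
        [ -e "$path" ] || continue
        old=${path##*/}
        new=$(sanitize_name "$old")
        # Nothing to do if the name is already clean.
        [ "$old" = "$new" ] && continue
        # Never overwrite an existing file.
        if [ -e "$dir/$new" ]; then
            echo "skip: $new already exists" >&2
            continue
        fi
        if [ "$dry_run" = "-n" ]; then
            printf 'would rename %s -> %s\n' "$old" "$new"
        else
            mv -- "$path" "$dir/$new"
        fi
    done
}
```

Call sanitize_dir . -n first to preview, then sanitize_dir . once the output looks right.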
That would replace every character that is not a letter, digit, or dash with an underscore. Run it with the -n option to see what would happen in a dry run.

I guess that modern OSes usually choose UTF-8 for encoding file names, so it's not a problem to have non-US characters in file names. What you have experienced is probably the result of a file name that was created with a non-UTF-8 encoding. It's quite hard to tell what you can do about that, and it also depends on what you'd like. If you need the correct file name (for example "Porträt.pdf"), you first need to know the encoding of the original file name; then you can convert it. It's not easy to just guess, since there is a huge number of very different encodings.
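
To narrow down the original encoding, one option is to print how the raw bytes of a broken name read under a few likely candidates and pick the one that looks right. Below is a rough sketch using iconv. The function name decode_candidates and the list of candidate encodings are just assumptions for illustration, so adjust the list to wherever the files came from.

```shell
#!/bin/sh
# decode_candidates NAME: show NAME's raw bytes decoded from several
# legacy encodings and converted to UTF-8, so that a UTF-8 terminal
# can display each interpretation.
decode_candidates() {
    for enc in ISO-8859-1 CP1252 CP850; do
        printf '%s: ' "$enc"
        printf '%s' "$1" | iconv -f "$enc" -t UTF-8 || printf '(cannot decode)'
        printf '\n'
    done
}

# Example: byte 0xE4 (octal 344) is "ä" in ISO-8859-1.
decode_candidates "$(printf 'Portr\344t.pdf')"
```

Once one of the lines shows the name you expect, that encoding is the value to pass to convmv -f.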