This is related to https://stackoverflow.com/questions/1791082/utf-8-php-and-xml-mysql, which I am still trying to get my head around.
I have a couple of separate questions that will hopefully help me understand how to resolve the issues I am having.
I am trying to read values from a database and output them to a file in UTF-8 format. But I am having encoding issues, so I thought I would strip back all my code and start with:
$string = "Otivägen";
// then output to a file.
But in vim I can’t even enter that string; every time I paste it in I get Otivägen.
I tried to create a blank PHP file with only that string and upload it, but when I cat the file again I get Otivägen.
My questions are:
- Why is vim displaying it like this?
- If the file is downloaded, would it display correctly if an application was expecting UTF-8?
- How can I output this string into a file that will eventually be an XML file in UTF-8 encoding?
My understanding of encoding is limited at the moment, and I am trying to understand it.
This looks like vim is displaying UTF-8-encoded data as ISO 8859-1. Copy and paste can be problematic (you don't say which system you are on), so I'd advise typing in the text directly.
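To make the mix-up concrete, here is a minimal PHP sketch of what happens when UTF-8 bytes are read as ISO 8859-1. The helper name view_as_latin1 is made up for illustration, and the string is written with explicit byte escapes so the result does not depend on how this snippet itself is saved.

```php
<?php
// "Otivägen" encoded as UTF-8: the "ä" is the two bytes C3 A4.
$utf8 = "Otiv\xC3\xA4gen";

// Hypothetical helper: treat every byte as one ISO 8859-1 character and
// re-encode it as UTF-8, which is what a Latin-1 viewer effectively shows.
function view_as_latin1(string $bytes): string
{
    $out = '';
    foreach (str_split($bytes) as $byte) {
        $code = ord($byte);
        $out .= $code < 0x80
            ? $byte                                                  // ASCII stays as it is
            : chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F)); // U+0080..U+00FF
    }
    return $out;
}

echo bin2hex($utf8), "\n";        // hex dump of the raw UTF-8 bytes
echo view_as_latin1($utf8), "\n"; // the garbled form from the question
```

The two bytes of the "ä" come out as two separate characters, which matches the garbled text in the question. The file itself is probably fine; only the display is using the wrong encoding.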
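To properly edit the file in vim, first set vim to use UTF-8, for example with :set encoding=utf-8 (and, if needed, :set fileencoding=utf-8 for the file being saved).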
Then type in the text, make sure it's correctly displayed, and save. That will give you a file encoded in UTF-8.
Whether the downloaded file displays correctly depends on its encoding. If you save it as described above, then yes.
Writing the string out as UTF-8 from PHP is apparently very difficult. I'm not that familiar with PHP, but according to Wikipedia:
So you'll probably have to google for a workaround. There are also a few UTF-8 helper libraries for PHP. Otherwise it might be better to choose a different language, e.g. Java, which has solid Unicode support.
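For the simple case in the question, a sketch like the following may already be enough. It relies on the fact that PHP strings are plain byte sequences, so bytes that are already UTF-8 are written out unchanged. The element name street and the output path are assumptions made up for this example.

```php
<?php
// "Otivägen" as UTF-8 bytes. A literal typed into a source file that is
// itself saved as UTF-8 would hold the same bytes.
$string = "Otiv\xC3\xA4gen";

// Assumed element name, for illustration only. htmlspecialchars() escapes
// the XML special characters and leaves valid UTF-8 text untouched.
$xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
     . '<street>' . htmlspecialchars($string, ENT_XML1 | ENT_QUOTES, 'UTF-8') . '</street>' . "\n";

// Assumed output location: a file in the system temp directory.
$path = sys_get_temp_dir() . '/street.xml';
file_put_contents($path, $xml);

// Read the file back to check what actually ended up on disk.
$written = file_get_contents($path);
echo $written;
```

If the values come from a database instead of a literal, the database connection would also have to deliver UTF-8, otherwise the bytes are already wrong before they reach the file.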
UTF8 is fun. Once it works. :-/ If anything in the chain is expecting something else and doesn't check, then it all goes pear-shaped.
:set encoding=utf-8
I just tried it and that made everything work.
The key is that everything needs to be in UTF8 mode.
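One way to find the step in the chain that is not in UTF-8 mode is to check the bytes at each stage. The following is only a sketch: the function name is_valid_utf8 is invented here, and it uses the PCRE u modifier, which makes preg_match() fail on input that is not valid UTF-8.

```php
<?php
// Hypothetical helper: returns true when $bytes is well-formed UTF-8.
// With the "u" modifier, preg_match() returns false for invalid UTF-8
// and 1 for valid input, because the empty pattern always matches.
function is_valid_utf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

$utf8   = "Otiv\xC3\xA4gen"; // "ä" as the UTF-8 bytes C3 A4
$latin1 = "Otiv\xE4gen";     // "ä" as the single ISO 8859-1 byte E4

var_dump(is_valid_utf8($utf8));   // bool(true)
var_dump(is_valid_utf8($latin1)); // bool(false)
```

A check like this only says whether the bytes are well-formed UTF-8. Text that was converted twice is still valid UTF-8, so it would pass even though it looks garbled.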