It seems like SQL Server 2008 does not support code page 65001 when using Bulk Insert. If I use other code pages, my data is corrupted when imported. Is there a way to insert my UTF8 data properly in SQL Server 2008 using scripts?
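SQL Server supports Unicode, but, like Java, it stores Unicode data only as UTF-16 Little Endian in NCHAR, NVARCHAR, and NTEXT columns. (The SQL Server documentation calls this UCS-2; strictly speaking, UCS-2 is the fixed-width subset of UTF-16 that has no surrogate pairs.) I assume you're talking about the BCP utility, which, for Unicode input, accepts only UCS-2 data on import; it will not convert UTF-8.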
Other SQL Server tools, such as SQL Server Integration Services (SSIS), may support on-the-fly conversion of UTF-8 data. However, you may be better off preprocessing your files with an open source command-line tool that converts UTF-8 to UTF-16 Little Endian, and then using BCP if that is your preferred tool.
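If you would rather script the conversion yourself than install a separate tool, here is a minimal Python sketch of that preprocessing step. The function name and the chunk size are my own choices for illustration, and writing a byte order mark at the start of the output is an assumption about what your import expects, so check it against your setup.

```python
def utf8_to_utf16le(src_path, dst_path, chunk_size=1 << 20):
    """Re-encode a UTF-8 text file as UTF-16 Little Endian.

    Hypothetical helper for illustration, not part of any SQL Server tool.
    """
    # "utf-8-sig" drops a leading UTF-8 byte order mark if one is present.
    # newline="" keeps the original line endings untouched, so the row
    # terminators in the data file stay exactly as they were.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src, \
         open(dst_path, "w", encoding="utf-16-le", newline="") as dst:
        # Assumption: the importer wants a BOM. Encoded as UTF-16 LE,
        # U+FEFF becomes the bytes FF FE.
        dst.write("\ufeff")
        # Copy in chunks so large files do not have to fit in memory.
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)

# Example call (the file names are placeholders):
# utf8_to_utf16le("data_utf8.txt", "data_utf16.txt")
```

The converted file can then be loaded as wide-character data, for example with the `-w` switch of BCP or with `DATAFILETYPE = 'widechar'` in BULK INSERT. Characters outside the Basic Multilingual Plane are written as surrogate pairs, so test with a sample of your own data if you have any.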