SQL Server 生成的“文本文件”使用复制命令将数据导入到postgressql
我有一个使用 SQL Server 2005 生成的文本文件。当我使用“复制”将该文本文件导入到我的 PostgreSQL 表之一时,出现以下错误:
错误:编码“UTF8”的字节序列无效:0xff
谁能告诉我需要做什么才能将数据从 SQL Server 2005 获取到 PostgreSQL?
I have one text file generated using SQL Server 2005. While I am importing the text file into one of my PostgreSQL table using "copy" it is giving me the following error:
ERROR: invalid byte sequence for encoding "UTF8": 0xff
Can any one tell me what i need to do get the data from SQL Server 2005 to PostgreSQL?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
我遇到了完全相同的情况(除了 SQL Server 2008 而不是 2005)。当我使用“SQL Server 导入和导出向导”导出文件并选择平面文件作为目标时,“代码页”参数默认为“1252 (ANSI - Latin I)”。
因此,当在 postgreSQL 中运行复制命令时,我使用
set client_encoding to 'LATIN1';
- 并且执行成功。复制表名 FROM 'path/to/file.csv' DELIMITERS '|' CSV;
I had exactly the same situation (except for SQL Server 2008 instead of 2005). When I was exporting the file using 'SQL Server Import and Export Wizard' and picked Flat File as Destination, the 'Code page' parameter defaulted to '1252 (ANSI - Latin I)'.
Thus, when running copy command in postgreSQL I used
set client_encoding to 'LATIN1';
- and it executed successfully.copy tablename FROM 'path/to/file.csv' DELIMITERS '|' CSV;
如果文件确实采用 Windows-1252 编码,那么您可以在运行复制命令之前使用
set client_encoding=windows_1252
切换编码。查看手册以获取可用编码的列表:
http://www.postgresql.org/docs/9.0/static/multibyte.html
If the file is indeeded in Windows-1252 encoding then you can switch the encoding by using
set client_encoding=windows_1252
before running the copy command.Check out the manual for a list of available encodings:
http://www.postgresql.org/docs/9.0/static/multibyte.html
SQL Server 文本输出(BCP 文件?)过去被编写为 UTF-16,这是 PostgreSQL 不支持的 Unicode 编码。 UTF-16 文件以 0xff 0xfe(或 0xfe 0xff)开头,因此这将是首先收到有关该特定字节值的投诉的原因之一。
在 Linux 或类似系统上,我建议使用“recode”或“iconv”实用程序将 UTF-16 转换为 UTF-8,这是 PostgreSQL 的首选全 Unicode 编码。在 Windows 上执行该任务的一个建议是获取 Windows 版本的重新编码:UTF-16 到 UTF-8 转换(用于 Windows 中的脚本)
SQL Server text output (BCP files?) have in the past been written as UTF-16, which is a Unicode encoding that PostgreSQL doesn't support. UTF-16 files start with 0xff 0xfe (or 0xfe 0xff) so that would be one reason for getting a complaint about that particular byte value first.
On Linux or similar I'd suggest using the "recode" or "iconv" utilities to convert from UTF-16 to UTF-8, which is PostgreSQL's preferred all-Unicode encoding. One recommendation for performing that task on Windows is just to get a Windows version of recode: UTF-16 to UTF-8 conversion (for scripting in Windows)