mysqlimport, line breaks, and character encoding (I think)
First, some disclosure: I am not a Linux admin, and my Linux admin is not a programmer.
That said, we have a cron job that runs a mysqlimport command to import a text file that's generated daily. I have no control over, or involvement in, how this file is created. Through trial and error, we discovered that the text file was being generated on a Windows machine, so for the lines-terminated-by argument we had to specify \r\n. Last week it stopped working correctly, and we determined it was because the file was now being generated on Linux, so we changed it to just \n. My understanding (which isn't entirely clear) is that the platform and encoding depend on who generates the text file.
We have a shell script that executes the mysqlimport command. When we provide the correct encoding, everything works perfectly. But since we don't know who's going to create the text file in the first place, is there a way to determine what the encoding is and apply the proper line-break character(s)? (And is "encoding" the proper term here?)
Comments (4)
I think you can use the dos2unix or unix2dos commands on Linux to convert between Windows and Linux line endings. So you don't have to detect which one is used; just run the command to make sure the file has the line endings you expect.
The exact invocation depends on the version of the utility, but generally you pass it the file name and it converts the file in place.
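As a rough sketch of what such an invocation might look like, assuming a dos2unix build that converts in place when given a file name (the file name daily_import.txt is made up for illustration, and the sample file is only created here so the example can be run on its own):

```shell
# Hypothetical file name; substitute the path of the real daily file.
file="daily_import.txt"

# Create a small sample with Windows (CRLF) line endings for demonstration.
printf 'id,name\r\n1,alice\r\n' > "$file"

# Convert CRLF to LF in place. A file that already uses LF keeps the same content.
dos2unix "$file"
```

Running unix2dos on the same file would go the other way, from \n back to \r\n.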
You need to run the dos2unix command on the file that mysqlimport reads, before calling mysqlimport. dos2unix converts Windows line endings ("\r\n") to *nix line endings ("\n"); if the file was generated on a *nix system, it remains untouched.
That way you make sure the same line-ending format is always fed to your shell script.
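Something along these lines might work inside the shell script. This is only a sketch: the function names, the database name example_db, and the file path are hypothetical, and it assumes credentials come from an option file such as ~/.my.cnf.

```shell
#!/bin/sh
# Sketch only: function names, database name, and paths are hypothetical.

# Convert CRLF to LF in place; a file that already uses LF keeps the same content.
normalize_line_endings() {
    dos2unix "$1"
}

# Normalize first, then import with a fixed terminator.
import_daily_file() {
    file="$1"
    normalize_line_endings "$file" || return 1
    # After normalization the terminator is always \n,
    # whichever platform produced the file.
    mysqlimport --local --lines-terminated-by='\n' example_db "$file"
}

# Usage from the cron job (not executed here):
# import_daily_file /path/to/daily_import.txt
```

With this arrangement the lines-terminated-by argument never has to change when the source platform does.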
Best Regards
fromdos, dos2unix, and tofrodos are three programs sometimes installed on Linux systems. You could use one of these to always convert the format to Unix line ends (\n).
It's probably easiest to use tr to delete the extra \r from the Windows line endings before running mysqlimport.
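A minimal sketch of the tr approach, with made-up file names and a sample file created only so the example runs on its own:

```shell
# Hypothetical file names.
src="daily_import.txt"
tmp="daily_import.txt.tmp"

# Sample file with Windows (CRLF) line endings for demonstration.
printf 'id,name\r\n1,alice\r\n' > "$src"

# tr cannot edit a file in place, so write to a temporary file
# and replace the original only if tr succeeded.
tr -d '\r' < "$src" > "$tmp" && mv "$tmp" "$src"
```

Note that this deletes every carriage return in the file, not only those at line ends, so it assumes the data itself never contains a \r. A file that already uses \n passes through unchanged.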