How can I efficiently search/replace in a large txt file?
I have a relatively large csv/text data file (33 MB) that I need to do a global search-and-replace of the delimiting character on. (The reason is that there doesn't seem to be a way to get SQLServer to escape/handle double quotes in the data during a table export, but that's another story...)
I successfully accomplished a TextMate search and replace on a smaller file, but TextMate chokes on this larger file.
It seems like command line grep may be the answer, but I can't quite grasp the syntax, à la:
grep -rl OLDSTRING . | xargs perl -pi~ -e 's/OLDSTRING/NEWSTRING/'
So in my case I'm searching for the '^' (caret) character and replacing with '"' (double-quote).
grep -rl " grep_test.txt | xargs perl -pi~ -e 's/"/^'
That doesn't work, and I'm assuming it has to do with the escaping of the double quote or something, but I'm pretty lost. Help, anyone?
(I suppose if anyone knows how to get SQLServer2005 to handle double quotes in a text column during export to csv, that'd really solve the core issue.)
Comments (2)
Your perl substitution seems to be wrong. Try:
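Something along these lines, assuming the file is grep_test.txt from the question (the ~ after -i tells Perl to keep a backup copy of the original):

perl -pi~ -e 's/\^/"/g' grep_test.txt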
Explanation: the caret is a regex metacharacter (it anchors a match to the start of the line), so on the search side it has to be escaped as \^. The replacement " is literal, but it needs to stay inside the single-quoted argument so the shell doesn't interpret it, and the /g flag replaces every occurrence on each line rather than just the first. The attempt in the question also has the search and replacement swapped (it would turn quotes into carets) and is missing the closing delimiter; and since only one file is involved, the grep | xargs step isn't needed at all.
Update: you can also use Perl, as rein has suggested.
But on big files, sed may run a bit faster than Perl, as my results on a 6 million line file showed.
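A sed one-liner along the same lines, as a minimal sketch assuming the same grep_test.txt; writing to a new file (grep_test_fixed.txt is just an illustrative name) sidesteps the differences between GNU and BSD/macOS sed's in-place -i flag:

sed 's/\^/"/g' grep_test.txt > grep_test_fixed.txt

If the fixed file looks right, move it over the original with mv grep_test_fixed.txt grep_test.txt.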