Best practices for managing file changes within a script
I have a Bash script that performs many operations on a file, for example:
cp input.txt file.tmp1
sed (code) file.tmp1 > file.tmp2
sed (code) file.tmp2 > file.tmp3
sed (code) file.tmp3 > file.tmp4
sed (code) file.tmp4 > file.tmp5
sed (code) file.tmp5 > file.tmp6
sed (code) file.tmp6 > file.tmp7
cp file.tmp7 output.txt
This way:
- The original file is left unchanged.
- I can check the changes to the file at each stage, to make sure my code did not do anything wrong.
However, this does not seem like an ideal way to handle the files.
- Is there a better way to do this?
- Is there a tool that can help inspect the changes, to see whether anything unusual was introduced?
Comments (2)
Working on a temporary file is a fine idea, but you should use mktemp(1) to create your temporary file safely.
While there's nothing wrong with using multiple files for multiple passes, consider using mktemp -d to create a temporary directory for all your files, so that you never overwrite anything the user cares about.
But if you're never going to look at the intermediate files, the multiple passes can be chained into a single pipeline, with each sed reading the output of the one before it.
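A minimal sketch of such a pipeline follows. The three sed expressions and the sample input are placeholders invented for illustration, since the original sed commands are not shown. The set -o pipefail line is a bash option added here as an assumption about how you would want errors reported: with it, a failure in any stage, not only the last one, makes the whole pipeline report failure.

```bash
#!/usr/bin/env bash
set -euo pipefail   # pipefail: the pipeline fails if any stage fails

# Hypothetical sample input in a scratch directory so the sketch runs on its own.
workdir=$(mktemp -d)
printf 'foo one\nfoo two\n' > "$workdir/input.txt"

# Three placeholder passes chained together; no intermediate files are written
# and the input file is only ever read.
sed 's/foo/bar/' "$workdir/input.txt" \
  | sed 's/one/1/' \
  | sed 's/two/2/' \
  > "$workdir/output.txt"

cat "$workdir/output.txt"
```

The only files involved are the input and the final output, so there is nothing to clean up between passes.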
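If one stage fails, the pipeline as a whole can be treated as failed (in bash, set -o pipefail makes the pipeline's exit status reflect a failure in any stage, not just the last), which can make for easier error handling. There are no temporary files to remove when you're finished.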
If you'd like to inspect the intermediate stages of the pipeline, tee(1) will help you. It copies its input both to its standard output and to a file, so you can place it between two stages to save whatever passes through.
You can inspect the changes by using diff -u input.txt output.txt. diff(1) is a line-wise differences program, and its -u unified output is pretty easy to read. wdiff(1) is a word-wise differences program, which might be more useful in some cases. And xxdiff(1) is a superb GUI for inspecting the differences between two files; it goes to some effort to show you individually changed characters. (It is also fantastic for handling CVS- and SVN-style conflict files, but that's another matter completely.)
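The tee and diff suggestions above can be combined with mktemp -d, roughly as sketched below. The sed expressions, the sample input, and the stage file names (stage1.txt, stage2.txt) are hypothetical and chosen only for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Keep every intermediate stage in a private temporary directory.
workdir=$(mktemp -d)

# Hypothetical sample input so the sketch runs on its own.
printf 'foo one\nfoo two\n' > "$workdir/input.txt"

# tee saves a copy of each stage while passing the data on to the next sed.
sed 's/foo/bar/' "$workdir/input.txt" \
  | tee "$workdir/stage1.txt" \
  | sed 's/one/1/' \
  | tee "$workdir/stage2.txt" \
  | sed 's/two/2/' \
  > "$workdir/output.txt"

# diff exits with status 1 when the files differ, so "|| true" keeps
# "set -e" from aborting the script on an expected difference.
diff -u "$workdir/input.txt"  "$workdir/stage1.txt" || true
diff -u "$workdir/stage1.txt" "$workdir/stage2.txt" || true
diff -u "$workdir/stage2.txt" "$workdir/output.txt" || true

echo "stage files kept in: $workdir"
```

Each diff shows what a single pass changed. Once you have finished inspecting the stages, the whole directory can be removed with rm -r "$workdir".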
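A more effective way would be to use pipes: feed input.txt to the first sed and pipe each command's output straight into the next, redirecting only the final result to output.txt.
The problem is that you cannot check the changes at the intermediate stages.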