Preserving leading white space while reading>>writing a file line by line in bash

Posted 2024-08-08 23:24:04

I am trying to loop through a directory of text files and combine them into one document. This works great, but the text files contain code snippets, and all of my formatting is getting collapsed to the left. All leading whitespace on a line is stripped.

#!/bin/sh
OUTPUT="../best_practices.textile"
FILES="../best-practices/*.textile"
for f in "$FILES"
do
  echo "Processing $f file..."
  echo "">$OUTPUT

  cat $f | while read line; do 
      echo "$line">>$OUTPUT
  done
  echo >>$OUTPUT
  echo >>$OUTPUT
done

I am admittedly a bash noob, but after searching high and low I couldn't find a proper solution. Apparently BASH hates the leading white space in general.

与往事干杯 2024-08-15 23:24:04

As others have pointed out, using cat or awk instead of a read-echo loop is a much better way to do this -- avoids the whitespace-trimming problem (and a couple of others you haven't stumbled upon), runs faster, and at least with cat, is simply cleaner code. Nonetheless, I'd like to take a stab at getting the read-echo loop to work right.

First, the whitespace-trimming problem: the read command automatically trims leading and trailing whitespace; this can be fixed by changing its definition of whitespace by setting the IFS variable to blank. Also, read assumes that a backslash at the end of the line means the next line is a continuation, and should be spliced together with this one; to fix this, use its -r (raw) flag. The third problem here is that many implementations of echo interpret escape sequences in the string (e.g. they may turn \n into an actual newline); to fix this, use printf instead. Finally, just as a general scripting hygiene rule, you shouldn't use cat when you don't actually need to; use input redirection instead. With those changes, the inner loop looks like this:

while IFS='' read -r line; do 
  printf "%s\n" "$line">>$OUTPUT
done <$f
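As a quick check of the first two points, the sketch below feeds the same line through a default read and through IFS= read -r. The sample string and the variable names are made up for illustration.

```shell
# One line with leading spaces and a literal backslash.
sample='    indented \t text'

# Default read: surrounding IFS whitespace is trimmed and the backslash
# is consumed as an escape character.
plain=$(printf '%s\n' "$sample" | { read line; printf '%s' "$line"; })

# IFS= turns off the trimming; -r keeps backslashes literal.
raw=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '%s' "$line"; })

printf 'plain: [%s]\n' "$plain"   # plain: [indented t text]
printf 'raw:   [%s]\n' "$raw"     # raw:   [    indented \t text]
```

Setting IFS as a prefix to read changes it only for that one command, so the rest of the script keeps its normal word splitting.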

...there are also a couple of other problems with the surrounding script: the line that tries to define FILES as the list of available .textile files has quotes around it, meaning it never gets expanded into an actual list of files. The best way to do this is to use an array:

FILES=(../best-practices/*.textile)
...
for f in "${FILES[@]}"

(and all occurrences of $f should be in double-quotes in case any of the filenames have spaces or other funny characters in them -- should really do this with $OUTPUT as well, though since that's defined in the script it's actually safe to leave off.)
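To see the quoting effect in isolation, here is a small sketch that counts loop iterations. It uses a throwaway directory and made-up file names, and instead of an array it writes the glob directly in the for statement, an alternative that should also work in a plain POSIX sh.

```shell
# Throwaway directory with two files, one with a space in its name.
dir=$(mktemp -d)
: > "$dir/a one.textile"
: > "$dir/b.textile"

# Quoted: the pattern stays one literal word, so the loop runs once
# with f set to the pattern itself.
FILES="$dir/*.textile"
quoted=0
for f in "$FILES"; do quoted=$((quoted + 1)); done

# Glob written in the for statement: one iteration per matching file,
# and names containing spaces stay intact.
globbed=0
for f in "$dir"/*.textile; do
  [ -f "$f" ] && globbed=$((globbed + 1))
done

printf 'quoted: %d iteration(s), globbed: %d file(s)\n' "$quoted" "$globbed"
rm -rf "$dir"
```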

Finally, there's an echo "">$OUTPUT near the top of the loop-over-files that's going to erase the output file every time through (i.e. at the end, it only contains the last .textile file); this needs to be moved to before the loop. I'm not sure if the intent here was to put a single blank line at the beginning of the file, or three blank lines between files (and one at the beginning and two at the end), so I'm not sure exactly what the appropriate replacement is. Anyway, here's what I came up with after fixing all of these problems:

#!/bin/bash
OUTPUT="../best_practices.textile"
FILES=(../best-practices/*.textile)

: >"$OUTPUT"
for f in "${FILES[@]}"
do
  echo "Processing $f file..."
  echo >>"$OUTPUT"

  while IFS='' read -r line; do 
    printf "%s\n" "$line">>"$OUTPUT"
  done <"$f"

  echo >>"$OUTPUT"
  echo >>"$OUTPUT"
done
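The same loop can be tried out on sample data. The sketch below wraps it in a function so it can run against a temporary directory; combine_files and the sample file contents are hypothetical, and the || [ -n "$line" ] guard is an addition that is not in the script above.

```shell
# combine_files is a hypothetical helper name, not part of the answer.
# Usage: combine_files OUTPUT FILE...
combine_files() {
  out=$1
  shift
  : > "$out"                      # truncate once, before the loop
  for f in "$@"; do
    echo >> "$out"                # blank line before each file
    # Added guard: keeps a final line that lacks a trailing newline,
    # which a bare read would otherwise drop.
    while IFS= read -r line || [ -n "$line" ]; do
      printf '%s\n' "$line" >> "$out"
    done < "$f"
    echo >> "$out"
    echo >> "$out"
  done
}

# Two small sample files; the second has no final newline.
tmp=$(mktemp -d)
printf 'def f():\n    return 1\n' > "$tmp/a.textile"
printf '  indented, no final newline' > "$tmp/b.textile"

combine_files "$tmp/out.textile" "$tmp/a.textile" "$tmp/b.textile"
combined=$(cat "$tmp/out.textile")
rm -rf "$tmp"

printf '%s\n' "$combined"
```

The indentation of "    return 1" and of the unterminated last line should both survive in the combined output.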

愛放△進行李 2024-08-15 23:24:04

Instead of:

cat $f | while read line; do 
    echo "$line">>$OUTPUT
done

Do this:

cat "$f" >>"$OUTPUT"

(If there's a reason you need to do things line by line it'd be good to include that in the question.)

彩扇题诗 2024-08-15 23:24:04

That's an overly expensive way of combining files.

cat ../best-practices/*.textile >  ../best_practices.textile

If you want to add a blank line (newline) before each file as you concatenate, use awk:

awk 'FNR==1{print "">"out.txt"}{print > "out.txt" }' *.textile

or

awk 'FNR==1{print ""}{print}' file* > out.txt

唠甜嗑 2024-08-15 23:24:04

This allows you to intersperse newlines between each input file as you have done in your original script:

for f in $FILES; do echo -ne '\n\n' | cat "$f" -; done > $OUTPUT

Note that $FILES is unquoted for this to work (otherwise the extra newlines appear only once at the end of all the output), but $f must be quoted to protect spaces in filenames, if they exist.
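A variant of the same idea, sketched under the assumption that the glob can be written directly in the for statement: this avoids the unquoted variable, and printf stands in for echo -ne, whose options are not portable across shells. The temporary directory and file contents are illustrative only.

```shell
tmp=$(mktemp -d)
printf '  first\n' > "$tmp/a.textile"
printf '  second\n' > "$tmp/b.textile"

# Same layout as the one-liner above: each file followed by two blank
# lines, with a single redirection for the whole loop.
for f in "$tmp"/*.textile; do
  cat "$f"
  printf '\n\n'
done > "$tmp/combined.txt"

combined=$(cat "$tmp/combined.txt")
rm -rf "$tmp"

printf '%s\n' "$combined"
```

Because the redirection is attached to done, the output file is opened once and nothing is truncated between files.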

十六岁半 2024-08-15 23:24:04

The correct answer, imo, is this, reproduced below:

while IFS= read line; do
    check=${line:0:1}
done < file.txt

Note that it'll take care of situations where the input is piped from another command, and not just from an actual file.

Note that you can also simplify the redirection as shown below.

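#!/bin/bash
OUTPUT="../best_practices.textile"
FILES=(../best-practices/*.textile)   # unquoted so the glob expands

: > "$OUTPUT"                         # truncate once, before the loop
for f in "${FILES[@]}"
do
  echo "Processing $f file..."
  {
  echo

  while IFS= read line; do 
      echo "$line"
  done < "$f"
  echo
  echo;
  } >> "$OUTPUT"                      # append so earlier files are kept
done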
