Why is my Bash script adding <feff> to the beginning of files?

Posted on 2024-08-16 09:37:33

I've written a script that uses sed to clean up .csv files, removing some bad commas and bad quotes ("bad" meaning they break an in-house program we use to transform these files):

# remove all commas, and re-insert the good commas using clean.sed
sed -f clean.sed $1 > $1.1st

# remove all quotes
sed 's/\"//g' $1.1st > $1.tmp

# add the good quotes around good commas
sed 's/\,/\"\,\"/g' $1.tmp > $1.tmp1

# add leading quotes
sed 's/^/\"/' $1.tmp1 > $1.tmp2

# add trailing quotes
sed 's/$/\"/' $1.tmp2 > $1.tmp3

# remove utf characters
sed 's/<feff>//' $1.tmp3 > $1.tmp4

# replace original file with new stripped version and delete .tmp files
cp -rf $1.tmp4 quotes_$1

Here is clean.sed:

s/\",\"/XXX/g;
:a
s/,//g
ta
s/XXX/\",\"/g;

Then it removes the temp files and voilà, we have a new file whose name starts with "quotes" that we can use for our other processes.
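
For comparison, the same substitutions could be chained in a single sed call, which would avoid the intermediate files. This is only a sketch under a few assumptions: the function name clean_csv is hypothetical, the clean.sed commands are inlined without the :a/ta loop (one s/,//g already removes every comma on a line), and the <feff> step is left out.

```shell
# Hypothetical helper: applies the script's substitutions in order,
# line by line, and writes the result to stdout (no temp files).
clean_csv() {
    sed -e 's/","/XXX/g' \
        -e 's/,//g' \
        -e 's/XXX/","/g' \
        -e 's/"//g' \
        -e 's/,/","/g' \
        -e 's/^/"/' \
        -e 's/$/"/' \
        "$1"
}
```

It might be called as clean_csv "$1" > "quotes_$1", with the arguments quoted so that file names containing spaces still work.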

My question is:
Why do I have to add a sed statement to remove the feff tag from that temp file? The original file doesn't appear to have it, but it always shows up in the replacement. At first I thought cp was causing this, but if I put the sed statement that removes it before the cp, it isn't there.
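
One thing that may be worth ruling out (an assumption, not something the post confirms) is that the original file already begins with a UTF-8 byte-order mark, U+FEFF, stored as the bytes EF BB BF. Some editors hide that character when it is the very first thing in a file and display it as <feff> anywhere else, which is where it would end up once a leading quote is inserted in front of it. A minimal sketch of such a check follows; the function name has_utf8_bom is hypothetical.

```shell
# Hypothetical helper: succeeds (exit status 0) if the file's first
# three bytes are EF BB BF, the UTF-8 encoding of U+FEFF.
has_utf8_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}
```

Running has_utf8_bom "$1" && echo "BOM present" against the original .csv, before any sed step, would show whether the mark was there all along.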

Maybe I'm just missing something...
