How to replace alternate positions using awk
I have a lookup file whose entries I search for in file_1; if a matching record is present, it should be replaced with #. Currently my code replaces the entire record with #, but I need to replace it only partially.
I want to keep one character and replace the next two with #, repeating along the string (see the expected output below). How can I do that? Your help will be much appreciated. Thanks
code
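awk ' NR==FNR {
    # First file (lookup): map each entry to a copy with every letter and digit replaced by #
    s = $0;
    gsub("[A-Za-z0-9]","#");
    a[s] = $0;
    next
}
{
    # Second file: if the text after a tag is a lookup entry, swap in its masked copy
    if (match($0, ">[^<]+"))
    {
        str = substr($0, RSTART+1, RLENGTH-1)
        if (str in a )
        {
            $0 = substr($0, 1, RSTART) a[str] substr($0, RSTART+RLENGTH)
        }
    }
    lines[FNR]=$0
}
END {for (i=1;i<=FNR;i++)
    {
        for (str in a )
        {
            regex = "\\<" str "\\>"
            gsub(regex,a[str],lines[i])
        }
        print lines[i]
    }
}' lookup file_1 > file_2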
cat lookup
CDX98XSD
@vanti Finserv Co.
11:11 - Capital
MS&CO(NY)
MS&CO(NY)
MS&CO(NY)
cat file_1
<html>
<body>
<hr><br><>span class="table">Records</span><table>
<tr class="data">
<td>@vanti Finserv Co.</td>
<td>11:11 - Capital</td>
<td>MS&CO(NY)</td>
<td>New York</td>
<td>CDX98XSD</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="data">
<td>@vanti Finserv Co.</td>
<td></td>
<td>MS&CO(NY)</td>
<td>2</td>
<td>2</td>
<td>MS&CO(NY)</td>
<td>MS&CO(NY)</td>
<td></td>
</table>
</body>
</html>
expected output
<html>
<body>
<hr><br><>span class="table">Records</span><table>
<tr class="data">
<td>@##n## F##s##v C##</td>
<td>1##11 - C##i##l</td>
<td>M##C##N##</td>
<td>New York</td>
<td>C##9##S#</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr class="data">
<td>@##n## F##s##v C##</td>
<td></td>
<td>M##C##N##</td>
<td>2</td>
<td>2</td>
<td>M##C##N##</td>
<td>M##C##N##</td>
<td></td>
</table>
</body>
</html>
Comments (1)
Assumptions/Understandings:
- duplicate entries in lookup can be ignored (i.e., we don't treat duplicate occurrences differently)
- for each whitespace-delimited string in lookup, we want to replace the nth/(n+1)th characters with # (where n = 2,5,8,11,14,17,20,...)
- for the lookup string 11:11 - Capital, the correct replacement string is 1##1# - C##i##l (as opposed to OP's 1##11 - C##i##l)
Extra lines were added to the input files (based on a comment from OP).
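To make the replacement rule in the assumptions concrete, here is a minimal sketch of it applied to single strings. It assumes the rule amounts to "within each whitespace-delimited word, keep the characters at positions 1, 4, 7, ... and turn the rest into #". The names mask_words and mask are hypothetical and are not taken from the answer's own script.

```shell
# mask_words: hypothetical helper name, used only for illustration.
# Reads lines on stdin and prints them with each word partially masked.
mask_words() {
    awk '
    # Keep the characters at positions 1, 4, 7, ... of a word; replace the rest with #.
    function mask(word,    out, p) {
        out = ""
        for (p = 1; p <= length(word); p++)
            out = out ((p % 3 == 1) ? substr(word, p, 1) : "#")
        return out
    }
    {
        line = ""
        rest = $0
        # Walk the line, masking each run of non-blank characters
        # and copying the blanks between runs unchanged.
        while (match(rest, /[^ \t]+/)) {
            line = line substr(rest, 1, RSTART - 1) mask(substr(rest, RSTART, RLENGTH))
            rest = substr(rest, RSTART + RLENGTH)
        }
        print line rest
    }'
}

printf '%s\n' 'CDX98XSD' '11:11 - Capital' 'MS&CO(NY)' | mask_words
# C##9##S#
# 1##1# - C##i##l
# M##C##N##
```

Under this reading, 11:11 becomes 1##1#, which matches the third assumption rather than OP's expected output.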
One awk idea:
This generates:
Focusing on just the differences:
Uncommenting the END{...} block generates:
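As a separate illustration, and not the answer's own script, the following is a minimal sketch of one way the whole task could be done under the assumptions listed earlier. It assumes that only text sitting between > and < and exactly equal to a lookup entry should be masked, and that lookup is non-empty (otherwise the NR==FNR test would treat the second file as the lookup). The names mask_file, mask, mask_words and masked are hypothetical.

```shell
# mask_file LOOKUP INPUT: hypothetical wrapper name; prints INPUT with lookup entries masked.
mask_file() {
    awk '
    # Keep the characters at positions 1, 4, 7, ... of a word; replace the rest with #.
    function mask(word,    out, p) {
        out = ""
        for (p = 1; p <= length(word); p++)
            out = out ((p % 3 == 1) ? substr(word, p, 1) : "#")
        return out
    }
    # Mask every whitespace-delimited word of s, leaving the blanks in place.
    function mask_words(s,    out) {
        out = ""
        while (match(s, /[^ \t]+/)) {
            out = out substr(s, 1, RSTART - 1) mask(substr(s, RSTART, RLENGTH))
            s = substr(s, RSTART + RLENGTH)
        }
        return out s
    }
    # First file: map each lookup entry to its masked form (duplicates collapse).
    NR == FNR { if ($0 != "") masked[$0] = mask_words($0); next }
    # Second file: replace any text between > and < that equals a lookup entry.
    {
        out = ""
        rest = $0
        while (match(rest, />[^<]+/)) {
            text = substr(rest, RSTART + 1, RLENGTH - 1)
            out = out substr(rest, 1, RSTART) ((text in masked) ? masked[text] : text)
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }' "$1" "$2"
}
```

With the files from the question, it would be called as mask_file lookup file_1 > file_2. Because entries are compared as plain strings rather than built into a regex, characters such as & and ( in MS&CO(NY) need no escaping.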