How to replace a string exactly using gsub()

Posted on 2024-12-07 19:13:31

I have a corpus:
txt = "a patterned layer within a microelectronic pattern."
I would like to replace the whole word "pattern" with "form", so I wrote:

txt_replaced = gsub("pattern","form",txt)

However, the result in txt_replaced is:
"a formed layer within a microelectronic form."

As you can see, "patterned" is wrongly changed to "formed", because the first seven characters of "patterned" match "pattern".

Is it possible to replace a string exactly using gsub()?
That is, only whole-word matches should be replaced.

The result I am after is:
"a patterned layer within a microelectronic form."

Many thanks!

ゃ人海孤独症 2024-12-14 19:13:31

As @koshke noted, a very similar question has been answered before (by me). ...But that was grep and this is gsub, so I'll answer it again:

"\<" is an escape sequence that matches the beginning of a word, and "\>" matches the end. In R string literals you need to double the backslashes, so:

txt <- "a patterned layer within a microelectronic pattern."
txt_replaced <- gsub("\\<pattern\\>","form",txt)
txt_replaced
# [1] "a patterned layer within a microelectronic form."
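To see what the doubled backslashes amount to, here is a small sketch (the variable name pat is just for illustration). The string literal contains two backslashes per escape, but the regex engine only receives one:

```r
# "\\<" in R source code is the two-character string \< once parsed.
pat <- "\\<pattern\\>"

# writeLines() shows the string as the regex engine will see it.
writeLines(pat)
# \<pattern\>

# 2 characters for \<, 7 for "pattern", 2 for \>.
nchar(pat)
# [1] 11
```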

Or, you could use \b instead of \< and \>. \b matches a word boundary, so it can be used at both ends:

txt_replaced <- gsub("\\bpattern\\b","form",txt)
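As a quick check, the following sketch runs the unanchored pattern and both word-boundary forms side by side on the sentence from the question. The variable names (plain, angle, boundary, boundary_perl) are only illustrative, and the perl = TRUE call is an extra showing that \b also works with the PCRE engine:

```r
txt <- "a patterned layer within a microelectronic pattern."

# Unanchored pattern: also rewrites the "pattern" inside "patterned".
plain <- gsub("pattern", "form", txt)

# Word-boundary versions: only the standalone word is replaced.
angle    <- gsub("\\<pattern\\>", "form", txt)
boundary <- gsub("\\bpattern\\b", "form", txt)

# \b is also understood when perl = TRUE is set.
boundary_perl <- gsub("\\bpattern\\b", "form", txt, perl = TRUE)

plain
# [1] "a formed layer within a microelectronic form."
angle
# [1] "a patterned layer within a microelectronic form."
identical(angle, boundary) && identical(boundary, boundary_perl)
# [1] TRUE
```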

Also note that if you want to replace only the first occurrence, you should use sub instead of gsub.
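The difference only shows up when the word occurs more than once, so this sketch uses a made-up sentence (txt2 is not from the question) containing the standalone word twice:

```r
# Hypothetical input with two whole-word occurrences of "pattern".
txt2 <- "a pattern on top of a pattern"

# sub() stops after the first match.
first_only <- sub("\\bpattern\\b", "form", txt2)
first_only
# [1] "a form on top of a pattern"

# gsub() replaces every match.
all_matches <- gsub("\\bpattern\\b", "form", txt2)
all_matches
# [1] "a form on top of a form"
```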
