Emacs regex word boundary (specifically concerning underscores)
I am trying to replace all occurrences of a whole word (say foo) in Emacs using M-x replace-regexp.
The problem is that I don't want to replace occurrences of foo inside underscored words such as word_foo_word.
If I use \bfoo\b to match foo, it also matches inside the underscored strings, because, as I understand it, Emacs treats underscores as word boundaries, unlike other regex systems such as Perl.
What would be the correct way to proceed?
Comments (2)
The regexp \<foo\> or \bfoo\b matches foo only when it's not preceded or followed by a word-constituent character (syntax code w, usually alphanumerics, so it matches in foo_bar but not in foo1).
Since Emacs 22, the regexp \_<foo_bar\_> matches foo_bar only when it's not preceded or followed by a symbol-constituent character. Symbol constituents include not only word constituents (alphanumerics) but also punctuation characters that are allowed in identifiers, meaning underscores in most programming languages. So, in a mode where the underscore has symbol syntax, \_<foo\_> matches a standalone foo but not the foo in word_foo_word.
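For the question's case, the symbol-boundary form can be typed directly at the prompt, as M-x replace-regexp RET \_<foo\_> RET bar RET (bar being a placeholder replacement). The same idea as a minimal Emacs Lisp sketch, where the function name my-replace-whole-symbol is hypothetical and the underscore is assumed to have symbol syntax in the current buffer's mode:

```elisp
;; Hypothetical helper, not part of Emacs: replace FROM with TO only
;; where FROM stands alone as a whole symbol.
(defun my-replace-whole-symbol (from to)
  "Replace whole-symbol occurrences of FROM with TO in the current buffer.
Return the number of replacements made."
  (let ((count 0)
        ;; \\_< and \\_> are the symbol-start and symbol-end anchors.
        (regexp (concat "\\_<" (regexp-quote from) "\\_>")))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward regexp nil t)
        ;; FIXEDCASE and LITERAL are t: insert TO exactly as given.
        (replace-match to t t)
        (setq count (1+ count))))
    count))
```

Calling (my-replace-whole-symbol "foo" "bar") in a buffer should then leave word_foo_word and foo1 alone while changing a bare foo, provided the mode's syntax table classifies the underscore as a symbol constituent.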
You wrote that Emacs considers underscores to be part of word boundaries.
The treatment of underscores, like everything else in Emacs, is configurable. This question:
How to make forward-word, backward-word, treat underscore as part of a word?
...asks the converse.
I think you could solve your problem by changing the syntax of underscores in the syntax table so that they are not part of words, and then doing the search/replace.
To do that, you need to know the mode you are using, and the name of the syntax table for that mode. In C++, it would be like this:
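A minimal sketch of such a call, assuming the buffer uses c++-mode and that its syntax table is held in the variable c++-mode-syntax-table from cc-mode (these names are assumptions for illustration, not necessarily the exact code the answer had in mind):

```elisp
;; Assumption: cc-mode is available and defines `c++-mode-syntax-table'.
(require 'cc-mode)

;; Give the underscore punctuation syntax (".") in C++ buffers,
;; so that it is not treated as part of a word or symbol.
(modify-syntax-entry ?_ "." c++-mode-syntax-table)
```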
The dot signifies "punctuation", which implies not part of a word. For more on that, try M-x describe-function on modify-syntax-entry.