Ruby strings: escaping and unescaping a custom character
Suppose I treat the £ character as dangerous, and I want to be able to protect and unprotect any string.
Example 1:
"Foobar £ foobar foobar foobar." # => dangerous string
"Foobar \£ foobar foobar foobar." # => protected string
Example 2:
"Foobar £ foobar £££££££foobar foobar." # => dangerous string
"Foobar \£ foobar \£\£\£\£\£\£\£foobar foobar." # => protected string
Example 3:
"Foobar \£ foobar \\£££££££foobar foobar." # => dangerous string
"Foobar \£ foobar \\\£\£\£\£\£\£\£foobar foobar." # => protected string
Is there an easy way, with Ruby, to escape (and unescape) a given character (such as £ in my example) in a string?
Edit: here is an explanation of the behavior I'm after.
First of all, thanks for your answers. I have a Rails app with a Tweet model that has a content field. Example of a tweet:
tweet = Tweet.create(content: "Hello @bob")
Inside the model, there's a serialization process that converts the string like this:
dump('Hello @bob') # => '["Hello £", 42]'
# ... where 42 is the id of the username bob
Then I can deserialize and display the tweet like this:
load('["Hello £", 42]') # => 'Hello @bob'
In the same way, it's also possible to do so with more than one username:
dump('Hello @bob and @joe!') # => '["Hello £ and £!", 42, 185]'
load('["Hello £ and £!", 42, 185]') # => 'Hello @bob and @joe!'
That's the goal :)
But this find-and-replace could be hard to perform with something like:
tweet = Tweet.create(content: "£ Hello @bob")
because here we also have to escape the £ character. I think your solution handles this well, so the result becomes:
dump('£ Hello @bob') # => '["\£ Hello £", 42]'
load('["\£ Hello £", 42]') # => '£ Hello @bob'
Just perfect. <3 <3
Now, if there is this:
tweet = Tweet.create(content: "\£ Hello @bob")
I think we should first escape every \, and then escape every £, like:
dump('\£ Hello @bob') # => '["\\£ Hello £", 42]'
load('["\\£ Hello £", 42]') # => '£ Hello @bob'
However... what can we do in this case:
tweet = Tweet.create(content: "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\£ Hello @bob")
...where tweet.content.gsub(/(?<!\\)(?=(?:\\\\)*£)/, "\\") does not seem to work.
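One way to make the two-step idea above (escape every \, then every £) hold for any number of backslashes is to do both in a single pass: prefix every \ and every £ with a \, and undo it by dropping the first backslash of each backslash-plus-character pair. The following is only a sketch under that assumption; the helper names escape_pound and unescape_pound are hypothetical and not part of the Tweet model described here. The block form of gsub is used so that the backslash is inserted literally rather than being read as a replacement-string escape.

```ruby
# Hypothetical helper: prefix every backslash and every £ with a backslash.
def escape_pound(s)
  s.gsub(/[\\£]/) { |c| "\\#{c}" }
end

# Hypothetical helper: turn each backslash+character pair back into the character.
def unescape_pound(s)
  s.gsub(/\\(.)/m) { $1 }
end

original = ("\\" * 18) + "£ Hello @bob"   # 18 backslashes, then £
escaped  = escape_pound(original)
puts unescape_pound(escaped) == original   # prints true
```

With this scheme every literal £ ends up behind an odd number of backslashes, so a bare £ written afterwards by the serializer as a placeholder should remain distinguishable from a literal one.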
Hopefully your version of Ruby supports lookbehinds. If it doesn't, my solution will not work for you.
Escape characters :
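A minimal sketch of this step, reusing the regex quoted in the question; the method name protect is a hypothetical helper introduced only for illustration. It inserts a backslash before any £ preceded by an even number of backslashes (including none) and leaves a £ that already has an odd number of backslashes in front of it untouched.

```ruby
# Hypothetical helper: insert one backslash in front of every unescaped £.
def protect(s)
  # Zero-width match: a position not preceded by a backslash and followed by
  # zero or more backslash pairs and then £. No text is consumed.
  s.gsub(/(?<!\\)(?=(?:\\\\)*£)/) { "\\" }
end

puts protect("Foobar £ foobar £££foobar")
# prints: Foobar \£ foobar \£\£\£foobar
```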
Un-escape characters :
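A matching sketch for the reverse direction, again with a hypothetical helper name (unprotect) and a regex written for illustration. It captures all backslashes except the last one and writes them back followed by the bare £.

```ruby
# Hypothetical helper: remove one backslash from every escaped £.
def unprotect(s)
  # $1 holds the leading backslash pairs, i.e. every backslash except the
  # final one; they are written back, followed by the bare £.
  s.gsub(/(?<!\\)((?:\\\\)*)\\£/) { "#{$1}£" }
end

puts unprotect("Foobar \\£ foobar \\£\\£\\£foobar")
# prints: Foobar £ foobar £££foobar
```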
Both regexes will work regardless of the number of backslashes. They complement each other.
Escape explanation:
Note that I am matching a position only. No text is consumed at all. Once I have pinpointed the position I want, I insert a \.
Explanation of unescape:
Here I save all the backslashes minus one and replace the whole run with those saved backslashes followed by the special character. Tricky stuff :)
If you are using Ruby 1.9, which has lookbehind, then FailedDev's answer should work quite well. If you are using Ruby 1.8, which does not have lookbehind (I think), a different approach may work. Give this a try:
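A sketch of what a lookbehind-free version might look like; the helper name protect_no_lookbehind and the exact regex are assumptions made for illustration. The pattern matches either an already-escaped pair (a backslash plus any character, captured in $1) or a bare £, and a block decides what to emit for each match.

```ruby
# Hypothetical helper: escape bare £ characters without using lookbehind.
def protect_no_lookbehind(s)
  s.gsub(/(\\.)|£/m) do
    # In Ruby an unmatched group is nil, so a plain truthiness test is enough:
    # keep an escaped pair unchanged, otherwise emit an escaped £.
    $1 ? $1 : "\\£"
  end
end

puts protect_no_lookbehind("Foobar £ foobar")
# prints: Foobar \£ foobar
```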
Note that I am not a Ruby programmer and this snippet is untested (in particular, I'm not sure whether the if ($1 != nil) statement is used correctly; it may need to be if ($1 != "") or if ($1)), but I do know that this general technique (using code in place of a simple replacement string) works. I recently used the same technique in my JavaScript solution to a similar question, which was looking for unescaped asterisks.
I'm not sure if this is what you want, but I think you can do a simple find-and-replace:
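A minimal sketch of such a find-and-replace, with hypothetical helper names naive_protect and naive_unprotect. It assumes a plain string pattern is enough and does not look at backslashes that may already precede a £.

```ruby
# Hypothetical helper: put a backslash in front of every £.
def naive_protect(s)
  s.gsub("£") { "\\£" }
end

# Hypothetical helper: remove the backslash in front of every £.
def naive_unprotect(s)
  s.gsub("\\£") { "£" }
end

puts naive_protect("£ Hello @bob")
# prints: \£ Hello @bob
```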
Note that I changed \ to \\ because you have to escape the backslash in a double-quoted string.
Edit: I think what you want is a regex that matches an odd number of backslashes:
That does the following transformations